Sometimes SQL queries are difficult or impossible to optimize if you are using pure standard SQL.
I started this because I just wanted to find a way to save the result of a sub-query to use later. In my example I wanted to save the average + the standard deviation of a large set of values for some research. I found that Posgresql supports standard SQL only so you can not save query results. Instead you can use plpgsql with is a procedural langauge that can be used with Postgresql (psql).
These are the two tables being considered for these examples.
CREATE TABLE IF NOT EXISTS test
(
test_id integer NOT NULL primary key,
algorithm_data_id integer NOT NULL references
test_timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
test_status integer NOT NULL,
num_agents integer NOT NULL,
num_obstacles integer NOT NULL
) ;
CREATE TABLE IF NOT EXISTS composite1
benchmark_id integer NOT NULL references test(test_id) ON DELETE CASCADE,
collisions integer NOT NULL,
time double precision NOT NULL,
success integer NOT NULL,
num_unique_collisions integer NOT NULL,
total_distance_travelled double precision NOT NULL,
agent_complete integer NOT NULL,
agent_success integer NOT NULL,
optimal_path_length double precision NOT NULL,
);
Basic syntax for a function definition
CREATE OR REPLACE FUNCTION multiply(x real, y real) RETURNS real AS
CREATE OR REPLACE The replace part is useful for development if testing a function and updating it but also useful if it so happens that there already exist a function with the same name you want to override it to ensure proper functionality.After the AS comes the function definition. The most non obvious part that I found was the use of '
$$'. Think of the $$
as {} from many programming languages that are used to symbolize a body of code, in this case the function body.
CREATE OR REPLACE FUNCTION multiply(x real, y real) RETURNS real AS
$$<br />BEGIN<br /> RETURN x * y;<br />END;<br />$$
LANGUAGE plpgsql;
This is a simple example of a simple definition that can be used to multiply two numbers.
Note the uses of ';' they are only at the end of return statements and queries. This will be seen more later.
A more advanced example:
CREATE OR REPLACE FUNCTION getAVG_WRTTotalTicksforScenario(scenarioID INT) RETURNS double precision AS
\(BODY\)
DECLARE
average double precision;
BEGIN
SELECT AVG(total_total_ticks_accumulated/total_number_of_times_executed) INTO STRICT average
FROM ppr_ai_data pd2, test test, algorithm_data
WHERE test.algorithm_data_id = algorithm_data.algorithm_data_id and
-- test.scenario_group = scenario_group and
algorithm_data.algorithm_data_id = pd2.ppr_ai_id and
test.scenario_group = scenarioID;
RETURN average;
END
\(BODY\) LANGUAGE plpgsql;
Any variables that are going to be used should be defined in the declare section.
The return types of functions can be of the basic types for can be of a time defined for a table. For example
CREATE OR REPLACE FUNCTION getAllSignificantTestsForScenario(scenarioID INT, algorithmName TEXT) RETURNS SETOF Test AS
\(BODY\)
DECLARE
average double precision;
std_dev double precision;
r test%rowtype;
BEGIN
average := getAVG_WRTTotalTicksforScenario(scenarioID)
std_dev := getSTDDEV_WRTTotalTicksforScenario(scenarioID)
RAISE NOTICE 'Average is %', average;
RAISE NOTICE 'Standard deviation is %', std_dev;
RETURN QUERY SELECT * from test where test_id IN (
SELECT t.test_id
FROM test t, ppr_ai_data pd, algorithm al, algorithm_data Adata
WHERE t.algorithm_data_id = Adata.algorithm_data_id and
pd.ppr_ai_id = Adata.algorithm_data_id and
Adata.algorithm_type = al.algorithm_id and
al.name = algorithmName and
t.scenario_group = scenarioID and
pd.total_total_ticks_accumulated / pd.total_number_of_times_executed > (average + std_dev));
RETURN;
END
Return can be used a in few ways. In the above case it is used to return the result of an entire query. In the above example the return is typed to that of a row of the table test. A small trick was used to get around having to return a full test row. The use of Raise in this case is just used to inform the user of the values being used in the sub query. Usually RAISE is used to inform the user issues in the function and can be used to throw exception like many other programming languages [4].
References
- http://www.postgresql.org/docs/9.1/static/plpgsql-statements.html
- http://www.postgresql.org/docs/8.4/static/plpgsql-declarations.html
- http://www.postgresql.org/docs/9.1/static/plpgsql-control-structures.html#PLPGSQL-STATEMENTS-RETURNING
- http://www.postgresql.org/docs/9.1/static/plpgsql-errors-and-messages.html