Cursors in MySQL Stored Procedures
https://www.sitepoint.com/cursors-mysql-stored-procedures/
After my previous article on Stored Procedures was published on SitePoint, I received quite a number of comments. One of them suggested further elaboration on CURSOR, an important feature in Stored Procedures.
As cursors are a part of a Stored Procedure, we will elaborate a bit more on stored procedures (SPs) in this article as well. In particular, we will see how to return a dataset from an SP.
What is a CURSOR?
A cursor can’t be used by itself in MySQL; it is only available inside stored programs such as stored procedures. I would be inclined to compare a cursor to a “pointer” in C/C++, or to the iterator behind PHP’s foreach statement.
With cursors, we can traverse a dataset and manipulate each record to accomplish certain tasks. When such an operation would otherwise be done in the PHP layer, running it in the database reduces the amount of data transferred, as we can return just the processed aggregation/statistical result to the PHP layer (thus eliminating the select – foreach – manipulation process at the client side).
Since a cursor is implemented in a stored procedure, it has all the benefits (and limitations) of an SP (access control, pre-compiled, hard to debug, etc).
The official documentation on cursors is located here. It contains only four commands that are related to cursor declaration, opening, closing, and fetching. As mentioned above, we will also touch on some other stored procedure statements. Let’s get started.
A real world question
My personal website has a page showing the scores of my favorite NBA team: LA Lakers. The table structure behind it is straightforward:
Fig 1. The Lakers matches status table structure
I have been updating this table since 2008. Some of the latest records showing Lakers’ 2013-14 season are shown below:
Fig 2. The Lakers matches status table data (partial) for 2013-2014 season
(I am using MySQL Workbench as the GUI tool to manage my MySQL databases. You can use your favorite tool.)
Well, I have to admit the Lakers are not playing very well these days: 6 consecutive losses up to Jan 15th. I get this “6 consecutive losses” by manually counting from the last played match upwards (towards earlier games) and seeing how long the run of “L” (meaning a loss) in the winlose column is. This is certainly doable, but if the requirement becomes more complicated or the table grows larger, it takes more time and is more error-prone.
Can we do this with a single SQL statement? I am not an SQL expert and I haven’t been able to figure out how to achieve the desired result (“6 consecutive losses”) from one SQL statement. The input of gurus will be highly appreciated – leave it in the comments below.
Can we do this in PHP? Yes, of course. We can retrieve the game data (particularly, the winlose column) for the current season and traverse the records to calculate the current win/lose streak. But to do that, we will have to grab all the data for that year, and most of it will be wasted (a team is unlikely to have a win/lose streak of more than 20 games in an 82-game regular season). Since we don’t know in advance how many records are needed to determine the streak, this waste is unavoidable. And finally, if the current win/lose streak is the only thing we want to know from that table, why pull all the raw data?
Can we do this via other means? Yes, it is possible. For example, we can create a redundant table specifically designed to store the current win/lose streak. Every insertion of the record will update that table too. But this is way too cumbersome and too error prone.
So, what is a better way to achieve this result?
Using Cursor in a Stored Procedure
As the name of this article suggests, we will see a better alternative (in my view) to solve this problem: using a cursor in a Stored Procedure.
Let’s create the first SP in MySQL Workbench as follows:
DELIMITER $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `streak`(in cur_year int, out longeststreak int, out status char(1))
BEGIN
declare current_win char(1);
declare current_streak int default 0;
declare current_status char(1);
declare done int default 0;
declare cur cursor for select winlose from lakers where year=cur_year and winlose<>'' order by id desc;
-- Fetching past the last row raises a NOT FOUND condition; this handler turns it into a flag
declare continue handler for not found set done=1;
open cur;
-- The first row (the latest game) decides whether this is a win or a loss streak
fetch cur into current_win;
if not done then
set current_streak=1;
start_loop: loop
fetch cur into current_status;
-- Stop when the rows run out or when the result flips
if done or current_status <> current_win then
leave start_loop;
end if;
set current_streak=current_streak+1;
end loop;
end if;
close cur;
set longeststreak=current_streak;
set `status`=current_win;
END$$
DELIMITER ;
In this SP, we have one input parameter and two output parameters. This defines the signature of the SP.
In the SP body, we also declared a few local variables to hold the streak status (win or lose, current_win), the current streak, and the win/lose status of a particular match.
declare cur cursor for select winlose from lakers where year=cur_year and winlose<>'' order by id desc;
The above line is the cursor declaration. We declared a cursor named cur, and the dataset bound to that cursor is the win/lose status of the matches already played (thus the winlose column is either “W” or “L” instead of empty) in a particular year, ordered by id descending (the latest played games have the highest id).
Though not displayed explicitly, we can imagine that this dataset will contain a series of “L”s and “W”s. Based on the data shown in Figure 2 above, it should be: “LLLLLLWLL…” (6 Ls, 1 W, etc).
To calculate the win/lose streak, we begin with the latest (and the first in the dataset) match data. When a cursor is opened, it always starts at the first record in the associated dataset.
After the first record is fetched, the cursor moves to the next record. MySQL cursors are forward-only, so the dataset is traversed strictly in order, one row per fetch, much like reading from a queue in a FIFO (First In First Out) manner. This is exactly what we want.
After getting the current win/lose status and setting the streak number, we continue to loop through (traverse) the remainder of the dataset. With each loop iteration, the cursor will “point” to the next record until we break the loop or all the records are consumed.
If the next win/lose status is the same as the current win/lose status, the streak goes on: we increase the streak number by 1 and continue traversing. Otherwise, the streak has ended and we can leave the loop early.
Finally, we close the cursor and release the resources. Then we return the desired output.
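To make the control flow of the traversal easier to follow, here is a minimal sketch of the same idea in plain PHP, with no database involved. A generator stands in for the cursor: it hands out one row per fetch and only moves forward. The names openCursor and streakFromCursor are made up for this illustration, and the sample rows are the “LLLLLLWLL” sequence described above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for the cursor: hands out one row per fetch,
 * forward only, and counts how many rows were actually fetched.
 */
function openCursor(array $rows, int &$fetched): Generator
{
    foreach ($rows as $row) {
        $fetched++;
        yield $row;
    }
}

/** Mirrors the loop in the streak procedure; returns [status, streak]. */
function streakFromCursor(Generator $cur): array
{
    if (!$cur->valid()) {
        return [null, 0];            // empty dataset: nothing to fetch
    }
    $currentWin = $cur->current();   // first fetch: the latest game
    $streak = 1;
    $cur->next();
    while ($cur->valid()) {          // becomes false once every row is consumed
        if ($cur->current() !== $currentWin) {
            break;                   // same role as "leave start_loop"
        }
        $streak++;
        $cur->next();
    }
    return [$currentWin, $streak];
}

$fetched = 0;
$rows = str_split('LLLLLLWLL');      // latest game first, as in the cursor's ORDER BY
[$status, $streak] = streakFromCursor(openCursor($rows, $fetched));
echo "$streak consecutive $status after fetching $fetched rows\n";
```

With this sample, the loop stops at the first “W”, so only 7 of the 9 rows are ever fetched. That is the appeal of leaving the loop early: the rest of the dataset is never touched.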
Next, we can enhance the access control of the SP as described in my previous article.
To test the output of this SP, we will write a short PHP script:
<?php
$dbms = 'mysql';
$host = 'localhost';
$db = 'sitepoint';
$user = 'root';
$pass = 'your_pass_here';
$dsn = "$dbms:host=$host;dbname=$db";
$cn=new PDO($dsn, $user, $pass);
// The OUT parameters are stored in MySQL user variables...
$cn->exec('call streak(2013, @longeststreak, @status)');
// ...and read back with a plain select
$res=$cn->query('select @longeststreak, @status')->fetchAll();
var_dump($res); //Dump the output here to get a raw view of the output
// Map the one-letter status to a readable word
$win=$res[0]['@status']==='L'?'Loss':'Win';
$streak=$res[0]['@longeststreak'];
echo "Lakers is now $streak consecutive $win.\n";
This will output something like the following figure:
(This output is based on Lakers’ match up to Jan 15th, 2014.)
Return a dataset from a Stored Procedure
There has been some discussion on how to return a dataset from an SP when that dataset is built from the results of repeated calls to another SP.
A user may want more than our previously created SP offers, as it only returns the win/lose streak for one year. For example, we may want a table showing the win/lose streaks for all years, in a form like:
YEAR | Win/Lose | Streak |
---|---|---|
2013 | L | 6 |
2012 | L | 4 |
2011 | L | 2 |
(Well, a more useful result can be to return the longest win streak and loss streak in a particular season. This requirement can be easily expanded from the previous SP so I will leave it to interested parties to implement. For the purpose of this article, we will stick to the current win/loss streak.)
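For readers who want to try that extension, the counting logic can be sketched in PHP first and then ported to a cursor loop. This is only an illustration under assumptions: the function name longestStreaks is invented here, the input is assumed to contain nothing but 'W' and 'L' values in chronological order, and the sample season is made up rather than taken from the lakers table.

```php
<?php
declare(strict_types=1);

/**
 * Longest run of each result in one season.
 * $results holds only 'W' and 'L' values in chronological order.
 * Returns an array such as ['W' => 3, 'L' => 4].
 */
function longestStreaks(array $results): array
{
    $longest = ['W' => 0, 'L' => 0];
    $runStatus = null;   // result of the run currently being counted
    $runLength = 0;      // length of that run so far
    foreach ($results as $status) {
        // Extend the current run, or start a new one when the result flips
        $runLength = ($status === $runStatus) ? $runLength + 1 : 1;
        $runStatus = $status;
        if ($runLength > $longest[$status]) {
            $longest[$status] = $runLength;
        }
    }
    return $longest;
}

// Made-up sample season, not real Lakers data
$sample = str_split('WWLWWWLLLLW');
print_r(longestStreaks($sample));
```

Unlike the current-streak case, this version has to visit every row of the season, so there is no early exit from the loop.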
Through its output parameters, a MySQL SP can only return scalar results (an integer, a string, etc.). To return a dataset, the SP has to run a select ... from ... statement, whose result is then sent to the caller. The issue here is that the table-form data we want to see does not exist in our current db structure and is constructed from another SP.
To tackle this, we need the help of a temporary table, or if the situation allows and requires it, a redundant table. Let’s see how we can achieve our target via a temporary table.
First, we will create a second SP as shown below:
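DELIMITER $$
CREATE DEFINER=`root`@`%` PROCEDURE `yearly_streak`()
begin
declare cur_year, max_year, min_year int;
-- Find the range of seasons stored in the table
select max(year), min(year) from lakers into max_year, min_year;
-- The temp table holds one row per season and is private to this connection
DROP TEMPORARY TABLE IF EXISTS yearly_streak;
CREATE TEMPORARY TABLE yearly_streak (season int, streak int, win char(1));
set cur_year=max_year;
year_loop: loop
-- Also stops when the table is empty (max(year) is NULL)
if cur_year is null or cur_year<min_year then
leave year_loop;
end if;
-- Reuse the first SP; its OUT values land in the user variables @l and @s
call streak(cur_year, @l, @s);
insert into yearly_streak values (cur_year, @l, @s);
set cur_year=cur_year-1;
end loop;
-- This select is what the caller receives as the dataset
select * from yearly_streak;
DROP TEMPORARY TABLE IF EXISTS yearly_streak;
END$$
DELIMITER ;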
A few key things to notice here:
- We determine the max year and min year by selecting from the table lakers;
- We created a temp table to hold the output, with the structure requested by the output (season, streak, win);
- In the loop, we first execute our previously created SP, with necessary parameters (call streak(cur_year, @l, @s);), then grab the data returned and insert into the temp table (insert into yearly_streak values (cur_year, @l, @s);).
- Finally, we select from the temp table and return the dataset, then do some cleaning (DROP TEMPORARY TABLE IF EXISTS yearly_streak;).
To get the results, we create another short PHP script as shown below:
<?php
// ... the db connection parameters go here
$cn=new PDO($dsn, $user, $pass);
// The SP ends with a select, so the call itself returns the dataset
$res=$cn->query('call yearly_streak')->fetchAll();
foreach ($res as $r)
{
printf("In year %d, the current W/L streak is %d %s\n", $r['season'], $r['streak'], $r['win']);
}
And the display will be like:
Please note that the above is a bit different from calling our first SP.
The first SP does not return a dataset but only two output parameters. In that case, we use PDO exec and then query to fetch the output. The second SP returns a dataset, so we use PDO query directly to invoke the SP.
Voila! We did it!
Conclusion
In this article, we dug further into MySQL stored procedures and took a look at the cursor functionality. We have demonstrated how to fetch scalar data via output parameters (defined as out var_name vartype in the SP declaration) and how to fetch a calculated dataset via a temp table. Along the way, we also met a few other statements commonly used in stored procedures.
The official documentation on the syntax of stored procedure and various statements can be found on the MySQL website. To create a stored procedure, please refer to these documents and to understand the statements, please see here.
Feel free to comment and let us know your thoughts!