Contrado Digital

MySQL Recursive Queries – MySQL While Loops – Fill Zero Sum Dates Between Dates

So, where do I start with this topic. It’s complex…

I didn’t really know what to title this blog post as, since it’s complex.

Specifically my personal problem at hand was to look at how to prevent missing dates between two dates when needing to report on basic challenges such as Show Number Of X Between Two Dates. And since the basic queries such as SELECT COUNT(*) FROM table_x GROUP BY my_date_field; tends to work absolutely fine in scenarios where things are happening daily, it dramatically fails when things often happen over longer time frames. Instead, it’s important that the actual dates between two dates are the primary axis on reporting on this data.

Quite surprisingly, this seems to have been a fairly challenging thing to achieve prior to MySQL 8, so this blog post is purely going to look at MySQL 8 and beyond for how to achieve this.

MySQL isn’t really designed to work with while loops. But in every modern programming language while loops are simple.

i.e.

while(x = true){
	doSomething();
}

Etc.

Great, then let’s wrap the complex while loops in the code, and leave MySQL for the basics of data retrieval.

Yeah… but there is a thing called performance, and that doesn’t really work on large scale data sets with millions of records…

It’s imperative that the data access complexities are pushed as low into the tech stack as possible to improve efficiency, on so many levels. Aka. Don’t put things in the code layer that would be better handled at the data layer, aka. MySQL.

So, historically, if you wanted to perform a while loop in a MySQL query you would probably end up resorting to a Stored Procedure in one way or another. Sorry DABs (aka. Database Administrators) but when software requires the use of Sored Procedures then there is probably something fundamentally wrong under the hood.

 

Basic While Loop in MySQL using WITH RECURSIVE

So, let’s get back to basics. How do we do a basic loop in MySQL 8. How do I count from 1 to 10 using pure SQL?

Turns out, it’s “pretty simple” (as everything is when you know how) but the syntax is a bit of a challenge. So let’s dig into that next.

 

How to Count from 1 to 10 in MySQL 8 Using Recursive Queries

The simple solution to this is as follows;

WITH RECURSIVE myRecursiveExpressionName(iCanCount) as (
		SELECT 1
UNION ALL
		SELECT 
			iCanCount + 1
		FROM 
			myRecursiveExpressionName 
		WHERE 
			iCanCount < 10
)
SELECT 
	* 
FROM 
	myRecursiveExpressionName 
ORDER BY 
	iCanCount ASC
;

Which will produce a nice output as follows when running the query;

  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10

Great. We can do some simple sequential “stuff”.

 

How to Use MySQL to Get the Dates Between Two Dates for Reporting

So, now it’s time to put this into practice. Let’s get a list of dates that we can use that are the definitive list of dates between two dates, such as the kind of thing that you would use in a situation that is measuring data and performance  and trends between two date periods.

WITH RECURSIVE allDatesBetweenTwoDates(myRecursiveExpressionName) as (
		SELECT '2023-10-01'
UNION ALL
		SELECT 
			myRecursiveExpressionName + INTERVAL 1 DAY 
		FROM 
			allDatesBetweenTwoDates 
		WHERE 
			myRecursiveExpressionName < '2023-10-12'
)
SELECT 
	* 
FROM 
	allDatesBetweenTwoDates 
ORDER BY 
	myRecursiveExpressionName ASC
;

Query Syntax

OK, this is all well and good with the examples. But what does it mean above? It’s fairly new syntax with MySQL and it isn’t easy to understand what it is actually doing. And the official documentation can be a tad difficult to interpret.

So let’s dig into this in a little more detail for what this all means in practice, I’ve added a few comments to the iCanCount example which helps explain what is going on…

-- MySQL While Loop
-- Think about this whole section with the WITH RECURSIVE bit as kind of like an in-memory virtual table with one column, called ‘allDatesBetweenTwoDates
WITH RECURSIVE myRecursiveExpressionName(allDatesBetweenTwoDates) as (
    		-- Non-Recursive Select Part...
    		-- Return initial row set
    		-- aka. Start Date
    		SELECT '2023-10-01'
UNION ALL
    		-- Recursive Select Part...
    		-- Return additional row sets
    		-- aka. End Date, with a stop expression via the WHERE clause
    		SELECT 
    			allDatesBetweenTwoDates + INTERVAL 1 DAY
    		FROM 
    			myRecursiveExpressionName 
    		WHERE 
    			allDatesBetweenTwoDates < '2023-10-12'
)
SELECT 
	allDatesBetweenTwoDates
FROM 
	myRecursiveExpressionName 
ORDER BY 
	allDatesBetweenTwoDates ASC
;

Which then nicely produces the a row for every date between those two dates;

  1. 2023-10-01
  2. 2023-10-02
  3. 2023-10-03
  4. 2023-10-04
  5. 2023-10-05
  6. 2023-10-06
  7. 2023-10-07
  8. 2023-10-08
  9. 2023-10-09
  10. 2023-10-10
  11. 2023-10-11
  12. 2023-10-12

 

Brilliant. Next it’s just back to your good old easy MySQL stuff to the data you want on those dates to join to the table of your choice to get the additional data you need. i.e. as a simple example;

WITH RECURSIVE myRecursiveExpressionName(allDatesBetweenTwoDates) as (
    		SELECT '2023-09-06'
UNION ALL
SELECT 
		allDatesBetweenTwoDates + INTERVAL 1 DAY
FROM 
		myRecursiveExpressionName 
WHERE 
allDatesBetweenTwoDates < '2023-10-06'
)
SELECT 
allDatesBetweenTwoDates
    	, IFNULL(SUM(my_table.my_summable_column), 0) -- Important to add the IFNULL check here to avoid NULLs returning as you can’t graph a NULL value
FROM 
	myRecursiveExpressionName 
    	LEFT JOIN my_table ON myRecursiveExpressionName.allDatesBetweenTwoDates = my_table.created_date
GROUP BY 
	myRecursiveExpressionName.allDatesBetweenTwoDates
ORDER BY 
	allDatesBetweenTwoDates ASC;

I have to mention though, while that last bit looks easy – this is going to be heavily dependent on your data, data quality and data relationships to get the exact data you need. This bit soon gets tricky when you have to include a WHERE statement, since the second you introduce a WHERE to filter data such as my_table.some_column_you_want_to_filter_on, then this will instantly get you back to the starting point as it removes all records where you just got a NULL/0 value, which is essentially the same problem at the start when trying to run;

SELECT my_date, COUNT(*) FROM my_table GROUP BY my_date;

Which fails miserably when there are missing records on certain dates. It’s surprising all of this is needed in this day and age. If only MySQL could create some kind of “WITH FILL MISSING DATES BETWEEN(fromDate, toDate)” kind of function/syntax to abstract all this kind of workings.

The following two tabs change content below.

Michael Cropper

Founder & Managing Director at Contrado Digital Ltd
Michael has been running Contrado Digital for over 10 years and has over 15 years experience working across the full range of disciplines including IT, Tech, Software Development, Digital Marketing, Analytics, SaaS, Startups, Organisational and Systems Thinking, DevOps, Project Management, Multi-Cloud, Digital and Technology Innovation and always with a business and commercial focus. He has a wealth of experience working with national and multi-national brands in a wide range of industries, across a wide range of specialisms, helping them achieve awesome results. Digital transformation, performance and collaboration are at the heart of everything Michael does.
Exit mobile version