
Building a Recursive BigQuery Mapper

Darren McCleary | The New York Times

ABOUT THE TALK

The New York Times Crossword is a cross-platform game that users can play anywhere. This means a puzzler can start a game on their phone during the morning bus ride to work, ponder it on their work laptop during the day, then complete it on their tablet before bed.
 
During our replatforming, we migrated our data from MySQL to GCP Datastore. This was a win from an application standpoint, but we gave up the ability to run bespoke, complex SQL queries to gain insight into our users and how they interact with our Crossword product. With ever more detailed data available and over 300,000 paying subscribers, giving up that ability was temporarily necessary but painful.
 
Therefore, we wrote a Go application that replicates our data from Datastore into BigQuery orders of magnitude faster than alternative solutions. The application itself is recursive and scales itself to meet the needs of the data load it is given. Using this system, we were able to achieve over 1.5 million streaming inserts per second. Now we're able to answer the questions everyone always asks about the NYT Crossword. How many people played the Mini today? Was it harder than last week's? What quantile does my time fall in? What word did solvers struggle with the most?
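
The abstract does not describe how the recursion is implemented, so the following is only a minimal sketch of one way a recursive mapper could scale with its input, not the application presented in the talk. It assumes the data can be addressed as a contiguous range of IDs, and every name in it (keyRange, maxBatch, replicate, replicateAll) is hypothetical. The insert callback is a stand-in for a BigQuery streaming insert; no Datastore or BigQuery client is used.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// keyRange is a hypothetical half-open interval [Start, End) of entity IDs.
type keyRange struct{ Start, End int }

// maxBatch is an assumed cap on how many rows a single worker handles.
const maxBatch = 500

// replicate halves r until each piece fits in one batch, then hands the
// leaf range to insert. Each half runs in its own goroutine, so the
// amount of concurrency grows with the size of the load.
func replicate(r keyRange, insert func(keyRange), wg *sync.WaitGroup) {
	defer wg.Done()
	if r.End-r.Start <= maxBatch {
		insert(r)
		return
	}
	mid := r.Start + (r.End-r.Start)/2
	// Register both children before this call's deferred Done runs, so
	// the WaitGroup counter cannot reach zero while work is pending.
	wg.Add(2)
	go replicate(keyRange{r.Start, mid}, insert, wg)
	go replicate(keyRange{mid, r.End}, insert, wg)
}

// replicateAll runs the recursion over [0, total) and returns the number
// of rows passed to the insert stand-in.
func replicateAll(total int) int64 {
	var rows int64
	var wg sync.WaitGroup
	insert := func(r keyRange) {
		// Stand-in for a streaming insert of the rows in r (assumption).
		atomic.AddInt64(&rows, int64(r.End-r.Start))
	}
	wg.Add(1)
	go replicate(keyRange{0, total}, insert, &wg)
	wg.Wait()
	return rows
}

func main() {
	fmt.Println(replicateAll(10000)) // prints 10000
}
```

In this sketch a small load is handled by a single call, while a large load fans out into as many leaf workers as it needs; a real system would also have to bound concurrency and retry failed inserts.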

 


Darren McCleary

Software Engineer | The New York Times

Darren is a software engineer on The New York Times Games Team. He's spent the last year migrating the entire Games platform from LAMP to a microservice architecture in GCP with zero downtime for its 300,000 paying puzzlers. He earned his Bachelor's in Computer Science from Drexel University in 2015 before going to work for The Times.
