Chris is a Databricks Certified Spark Developer with over 25 years of experience in project areas ranging from big data pipelines using Kafka, Elasticsearch, Spark and Scala to Spring/Hibernate/JMS-based enterprise applications, remote device management, and personalized content and ad delivery. He has co-authored patents in the areas of stream-based data processing and personalization techniques for mobile devices, and writes frequently about Spark and software development in general (with articles available on DZone, JavaWorld, and his own blog site, http://datalackey.com). Chris earned his BS in Computer Science from Yale University and a Certificate with Distinction in Business Administration from UC Berkeley Extension.
Between leap years and rules that vary by Locale, time processing has proven tricky to get right on the JVM platform. This session will provide a very quick refresher on Unix epoch-based timekeeping, time string formats and Locales, and then move to a discussion of the time and date processing functions in Spark SQL and the particular problems associated with making time window-based aggregations work as expected. This is a fairly advanced session: we will assume familiarity with Spark and will not go over any aspects of its architecture.