Fast Data is the New Black: Spark Summit East 2016 Conference

Spark Summit East 2016 was held last week in New York City and hosted over 1,300 attendees representing 500+ companies. As Spark’s popularity continues to surge, the unsurprisingly sold-out event delivered strong content through use cases from Netflix, Bloomberg, Comcast, Capital One, and The Weather Company. Participants got a look behind the curtain of Spark 2.0 as Databricks Co-Founder & CTO Matei Zaharia shared insight into what’s to come and introduced a new buzzword, “continuous applications.”

Organizations are growing increasingly impatient and want access to data as it is produced, in real time. Business users want insight faster. At one extreme, failing to respond immediately to customers’ needs and feedback can have serious business consequences: think of the damage to brand reputation on social channels like Twitter and Facebook when things go sour. Consumers, meanwhile, expect to have data and information right at their fingertips, and they hold organizations to the same standard.

Spark is an answer to this business problem and to consumers’ growing demand for immediate access to information and real-time customer service. Spark enables businesses to steer away from a batch architecture and reach those ‘light-bulb’ moments faster, in real time. This is fast data.

Yet there is still considerable reservation about adopting Spark and fast data in the enterprise. Spark is relatively new, and while it is one of the most actively contributed-to projects in the Apache Software Foundation, most contributions come from Databricks, which has a huge financial interest in the technology.

One of the best-attended and best-received sessions at Spark Summit East was “Top 5 mistakes when writing Spark applications,” which showcased limitations of the physical architecture, performance issues with larger data sets, and problems with joining disparate data. These mistakes show that Spark is still relatively early in its maturity, as organizations test the boundaries of its capabilities.

Despite Spark’s relative immaturity, organizations and the community are pushing forward with the technology, which is a leading indicator of its long-term viability as an enterprise product. You know you have a great product when many early adopters absorb battle scars and still take up their swords for the next battle. In the same spirit, Databricks announced the Community Edition Beta Program, a free version of its cloud-based fast data platform.

The media recently referred to Spark as “arguably the hottest big data technology of the year – or maybe ever.” So long as Spark remains attentive to the big data marketplace, incorporates feedback from real production use cases, stays open to the community for continuous improvement, and continues to innovate, the technology will remain a serious big data contender and the go-to platform for fast data.

We believe that Spark can be an integral part of the technology stack of every innovative organization. Is your organization ready for fast data?

By Ryan Daly