November 14, 2017 | SAN FRANCISCO
InfluxDays San Francisco 2017 was our inaugural event and we all had a blast!
In case you missed it, here is a list of the conference speakers’ talks.
PAUL DIX
Co-Founder and CTO InfluxData
IFQL: INFLUX FUNCTIONAL QUERY LANGUAGE – BRINGING DATA SCIENCE AND ANALYTICS INTO THE DATABASE
An introduction to the new query language that we’re building into InfluxDB and the InfluxData platform. Its design is functional and heavily inspired by projects like Pandas in Python and the Tidyverse projects in R. In addition to providing more complex query functionality, the language will facilitate more analytics and data science workloads within the database. Clustering on time series matrices, similarity metrics and k-nearest neighbors, forecasting models, and other data science tasks may become simple query operators within the query language. This talk will introduce the data model and some of the functions, and walk through a working prototype implementation that showcases functionality unavailable in the existing query language.
About Paul Dix: Paul Dix is the creator of InfluxDB. He has helped build software for startups, large companies and organizations like Microsoft, Google, McAfee, Thomson Reuters, and Air Force Space Command. He is the series editor for Addison Wesley’s Data & Analytics book and video series. In 2010 Paul wrote the book Service-Oriented Design with Ruby and Rails for Addison Wesley. In 2009 he started the NYC Machine Learning Meetup, which now has over 7,000 members. Paul holds a degree in computer science from Columbia University.
CHRISTINE YEN
Co-Founder and CTO Honeycomb.io
OBSERVABILITY: IT’S NOT JUST AN OPS THING
Instrumentation and monitoring often get lumped under the “ops” umbrella, but they’re just as critical for the software engineer. Answering questions about what your code is doing can be just as important during the development process as it is after you’ve shipped. Let’s talk about how this plays out in the real world by using Honeycomb as a case study: relying on system visibility to inform planning during the development process, observing new changes during and after release, and (of course) debugging. We’ll discuss specific instrumentation practices that we love and have used successfully to gain visibility into our system’s day-to-day operations, and how those techniques have evolved over time.
About Christine Yen: Christine Yen is co-founder and CTO at Honeycomb. She has built systems and products at companies large and small and likes to have her fingers in as many pies as possible. Previously, she built Parse’s analytics product and wrote software at a few now-defunct startups.
DAN CECH
Director, Platform Services Grafana Labs
DATA VISUALIZATION & ALERTING WITH GRAFANA
Grafana is the leading graph and dashboard builder for visualizing time series, which makes it a great tool for visual monitoring of InfluxData. This session will provide an intro to Grafana and talk about adding data sources, creating dashboards and getting the most out of your data visualization. The talk will look into some new features Grafana has to offer, as well as explain why different graphs are important and specifically how you can use them to analyze data performance and troubleshoot operational issues.
About Dan Cech: Dan Cech has 12+ years of experience developing high performance web applications and back-end systems that power internet companies worldwide. When not coding he can usually be found in the garage taking things apart (and sometimes putting them back together).
JARED P. LANDER
Chief Data Scientist Lander Analytics
MODELING TIME SERIES IN R
Temporal data is being produced in ever greater quantity so it is fortunate that our ability to analyze that data with time series methods has kept pace. During this talk, we look at a number of different techniques for modeling time series data. We start with traditional methods such as ARMA then go over more modern tools such as Prophet and even machine learning models like XGBoost. Along the way, we look at a bit of theory and the code for training these models.
About Jared P. Lander: Jared P. Lander is the Chief Data Scientist of Lander Analytics, a data science and artificial intelligence consulting and training firm based in New York City; the organizer of the New York Open Statistical Programming Meetup (the world’s largest R meetup) and the New York R Conference; author of R for Everyone; and an adjunct professor at Columbia University. With an M.A. from Columbia University in statistics and a B.S. from Muhlenberg College in mathematics, he has experience in both academic research and industry. Very active in the data community, Jared is a frequent speaker at conferences, universities and meetups around the world. His writings on statistics can be found at jaredlander.com and his work has been featured in publications such as Forbes and the Wall Street Journal.
Front End Engineer Honeycomb.io
As client-side app frameworks like React keep growing more popular, we’re shipping more and more application logic out to users’ browsers. But we don’t always know much about what happens to it after we send it out to the client. This talk will take you on a fast-paced tour of all the strange cases we’ve seen in browsers in the wild, from overseas proxy sites to rogue browser extensions to console-hacking customers with a sense of humor. Finally, we’ll talk about how to cut the noise and focus on minimum viable instrumentation to have visibility into the things that really matter to your users’ experience.
DAN VANDERKAM
Senior Software Engineer Sidewalk Labs
THE DYGRAPHS CHARTING LIBRARY
About Dan Vanderkam: Prior to joining Sidewalk, Dan Vanderkam worked at Mt. Sinai’s HammerLab building open source tools for visualizing genomes and detecting genetic variants in cancers. Before Mt. Sinai, Dan spent eight years at Google, primarily working on projects involving web search and search logs analysis. He was involved in creating quick answers and visualizations for queries such as “when does the sun set” or “how many people live in NYC.” Dan was also part of the team that built Google Flu Trends and Google Correlate, a tool for discovering patterns in search activity. Dan has a long history of developing a variety of digital side projects, the most recent of which was OldNYC, a collaboration with the New York Public Library which placed tens of thousands of historic photos on a map. Dan is a native of South Bend, Indiana. He studied Math and Computer Science at Rice University. Since 2010, he has been based in Brooklyn, where he can frequently be found cycling, climbing, running and throwing frisbees.
BARON SCHWARTZ
WHAT GOOD IS ANOMALY DETECTION?
Static thresholds on metrics have been falling out of fashion for a while, and for good reason. Modern tooling lets you analyze and monitor a lot more data points than you used to be able to, resulting in lots more noise. The hope is that anomaly detection answers some of this, by replacing static thresholds (anomalies) with dynamic ones. But it doesn’t work as well as most people think it will. In this talk, I’ll explain how anomaly detection works, so you can understand why it isn’t a good general-purpose solution, and which specific cases it’s good at.
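To make the contrast between static and dynamic thresholds concrete, here is a minimal sketch in Python. It is not taken from the talk: the function names, the rolling window of 5 points, and the 3-standard-deviation rule are all assumptions chosen for illustration, and real anomaly detection systems use many other techniques.

```python
from statistics import mean, stdev


def static_alerts(values, limit):
    """Return indices where a value exceeds a fixed limit (static threshold)."""
    return [i for i, v in enumerate(values) if v > limit]


def dynamic_alerts(values, window=5, k=3.0):
    """Return indices where a value is more than k standard deviations
    away from the mean of the preceding `window` points.

    The window size and k are illustrative assumptions, not recommendations.
    """
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # Skip perfectly flat history, where any change would look infinite.
        if sigma > 0 and abs(values[i] - mu) > k * sigma:
            alerts.append(i)
    return alerts


# A steady metric with one sudden spike at index 7.
series = [10, 11, 10, 12, 11, 10, 11, 30, 11, 10, 12, 11]
# A metric that climbs steadily past the static limit.
ramp = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]

print(static_alerts(series, 25), dynamic_alerts(series))
print(static_alerts(ramp, 25), dynamic_alerts(ramp))
```

In this toy data both approaches flag the spike, but only the static threshold flags the steady climb, because the rolling baseline moves along with the data. That is one simple way the two approaches can disagree.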
About Baron Schwartz: Baron Schwartz is a widely recognized expert on database internals, web performance, and large-scale application development, and is known for his contributions to database communities such as MySQL, PostgreSQL, Redis and MongoDB. His best-selling technical books and open-source software are used by tens of thousands of engineers every day. Before founding VividCortex, Baron was an executive at Percona. He has a degree in Computer Science from the University of Virginia.