In part 1 of this release blog series we introduced the latest version of the Deep Learning Toolkit (DLTK), which enables you to connect to Kubernetes and OpenShift. On top of that, DLTK 3.1 adds support for even more interesting algorithms from the world of machine learning and deep learning! Over the past few months, our customers’ data scientists have asked for various new algorithms and described use cases they wanted to tackle with DLTK. The four new examples below cover a subset of those requests and should also be helpful starting points for others.
Dask for Distributing Machine Learning Workloads
Machine learning workloads can easily consume a lot of compute time, especially when it comes to larger or more complex datasets. It’s no secret that distributing such workloads helps speed up training and related tasks. In the Python ecosystem, Dask provides advanced parallelism for analytics, enabling performance at scale. Dask also provides distributed machine learning algorithms via Dask-ML. The example below shows how a parallel implementation of K-Means can be easily integrated into Splunk using the Deep Learning Toolkit, and developed and monitored in Jupyter Lab.
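As a minimal sketch of the idea: Dask-ML’s `KMeans` mirrors the familiar scikit-learn estimator API, so the same fit/predict code works whether or not the workload is distributed. The snippet below is an illustration, not the DLTK notebook itself; the toy data is made up, and it falls back to scikit-learn when Dask-ML is not installed.

```python
import numpy as np

# Dask-ML's KMeans follows the scikit-learn estimator API, so swapping
# between a local and a distributed implementation is a one-line change.
try:
    from dask_ml.cluster import KMeans  # distributed implementation
except ImportError:
    from sklearn.cluster import KMeans  # local fallback for this sketch

# Toy data: two well-separated blobs, a stand-in for features from Splunk.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(100, 2)),
])

model = KMeans(n_clusters=2, random_state=0)
model.fit(X)

# np.asarray forces computation if predict returns a lazy Dask array.
labels = np.asarray(model.predict(X))
print(len(np.unique(labels)))  # 2 clusters found
```

In a real DLTK setup, the same estimator code would run inside the container environment, with a Dask scheduler distributing the work across workers.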
Device Agnostic PyTorch Example for CPU and GPU
When you connect the Deep Learning Toolkit to a GPU-enabled Docker or Kubernetes environment, you can accelerate model training significantly. Benchmarking an example dataset, we achieved a speedup of over 40x when we ran the example neural network classifier on a GPU compared to the CPU baseline. To put this into perspective: a training job that took over 30 minutes on CPU was cut down to a total of 45 seconds on GPU, including data transfer overhead. That makes for much more agile data science iterations and faster model creation.
Luckily, PyTorch makes it easy to write device-agnostic code that runs on both CPU and GPU using the .to(device) idiom, with minimal impact on your model code. We have added examples that show this functionality for a simple multiclass neural network classifier to get you started quickly.
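The .to(device) pattern can be sketched as follows. This is a hedged illustration rather than the exact notebook shipped with DLTK; the layer sizes and batch shape are arbitrary, and the same code runs unchanged on CPU or GPU.

```python
import torch
import torch.nn as nn

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A minimal multiclass classifier (sizes chosen only for illustration).
model = nn.Sequential(
    nn.Linear(4, 16),
    nn.ReLU(),
    nn.Linear(16, 3),
).to(device)  # moves all parameters to the chosen device

# Inputs must live on the same device as the model's parameters.
x = torch.randn(8, 4).to(device)
logits = model(x)
print(tuple(logits.shape))  # (8, 3): one score per class for each sample
```

Because both the model and the tensors are moved with the same `device` handle, no other part of the training loop needs to know which hardware is in use.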
Forecasting with Prophet
Built by Facebook’s Core Data Science team, Prophet is based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.
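The additive model described above can be written compactly as the sum of its components (this decomposition follows Prophet’s own documentation):

```latex
% Prophet's additive decomposition of a time series
y(t) = g(t) + s(t) + h(t) + \epsilon_t
% g(t): non-linear (piecewise) trend
% s(t): periodic seasonality (yearly, weekly, daily)
% h(t): holiday effects
% \epsilon_t: error term for everything the model does not capture
```

Fitting then reduces to estimating each component from history, which is why Prophet needs several seasons of data to pin down s(t) reliably.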
Although the forecast (green line) on the dashboard above is far from perfect, it can definitely serve as an example to get started quickly with experimentation. However, it also clearly shows that not every time series dataset is well suited for Prophet, so don’t forget to evaluate other robust forecasting methods that can be applied just as easily.
Graph Analysis with NetworkX
You may have read about the latest possibilities for graph analytics in Splunk using the freely available app from Splunkbase. My colleague Greg recently published two articles on how those techniques can be used for understanding and baselining network behavior.
When it comes to quickly developing code or experimenting with graph models, the graph analysis example in DLTK should help you get started quickly and explore more advanced modelling techniques with graphs.
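To give a flavour of this kind of analysis, here is a small NetworkX sketch. The edge list is hypothetical (think of src/dst pairs exported from Splunk events), and the measures shown are just two common starting points for baselining network behaviour.

```python
import networkx as nx

# Hypothetical edge list, e.g. (src, dst) host pairs from Splunk events.
edges = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.1", "10.0.0.3"),
    ("10.0.0.2", "10.0.0.3"),
    ("10.0.0.1", "10.0.0.6"),
    ("10.0.0.4", "10.0.0.5"),
]

G = nx.Graph(edges)

# Connected components reveal isolated clusters of communicating hosts.
components = list(nx.connected_components(G))

# Degree centrality highlights the most connected hosts in the graph.
centrality = nx.degree_centrality(G)

print(len(components))                      # 2 separate clusters
print(max(centrality, key=centrality.get))  # most connected host: 10.0.0.1
```

From here it is a small step to richer measures such as betweenness centrality or community detection, which is exactly the kind of experimentation the Jupyter Lab environment in DLTK is meant for.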
Big Thanks to the Community
Recently, a DLTK user in Japan built an extension that applies the library to Japanese-language text and makes the NLP example work for Japanese. Luckily, we were able to merge his contribution into the DLTK 3.1 release. I’m really happy to see this community mindset, and I want to thank him for his contribution: ありがとうございました (thank you very much)!
Last but not least, I would like to thank the many colleagues and contributors who helped me finish this release. A special thanks again to Anthony, Greg, Pierre and especially Robert for his continued support on DLTK and for making Kubernetes a reality today!
With the recently opened 'Call For Papers', I want to encourage you to submit your ideas by May 20. Let me know in case you have any questions!