Distributed Systems Conference
16 February 2019 | Pune
Milind Bhandarkar was a founding member of the team at Yahoo that took Apache Hadoop from a 20-node prototype to a datacenter-scale production system. Parallel programming languages and paradigms have been his area of focus for over 20 years. He has worked at C-DAC, the National Center for Supercomputing Applications (NCSA), the Center for Simulation of Advanced Rockets, Siebel Systems, Pathscale, Yahoo, LinkedIn, and Greenplum. Prior to founding Ampool, Milind was the Chief Scientist at Pivotal. Milind holds a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign.
Talk: Architecting Modern Data Platforms for Hybrid Clouds
Dr. Neil J. Gunther
Neil Gunther is a computer information systems researcher best known internationally for developing the open-source performance modeling software Pretty Damn Quick and developing the Guerrilla approach to computer capacity planning and performance analysis. He has also been cited for his contributions to the theory of large transients in computer systems and packet networks, and his universal law of computational scalability.
Talk: Applying The Universal Scalability Law to Distributed Systems
Dr. Sriram Srinivasan
Dr. Srinivasan is a systems and programming languages geek with 30+ years of experience, from embedded devices to large-scale distributed systems and frameworks. He was one of the principal designers and implementors of the WebLogic application server. He has a PhD from the University of Cambridge and teaches Distributed Systems at IIT Bombay.
Talk: Knowledge Logic: Applications to Distributed Systems and Life in General
Somya Maithani has been a backend developer (SDE II) at Helpshift for the last three and a half years. She loves solving logical problems, especially the challenging ones. In her spare time, she mostly reads books and watches series. She loves to bake and has just started dabbling in it.
Talk: Combating Entropy in a Highly Distributed System
Tapasweni is a Senior Software Engineer - Manager at Reliance Jio Financial Innovation Group and focuses on making services available on distributed systems. She likes keeping herself involved with open source projects and communities. Previously, she worked on the engineering teams at Mapbox, SAP, and Qualcomm. Tapasweni contributes to several open source projects and maintains a few. She also likes to write useful content across the web through different mediums.
Talk: Building Scalable Distributed Tracing and Monitoring Systems
Architecting Modern Data Platforms for Hybrid Clouds
In the last decade, since the emergence of public clouds, a hard boundary has remained between public clouds and on-premises infrastructures and services. With Azure Stack, GKE, VMware on AWS, and recent announcements about AWS Outposts, it is clear that the line between public clouds and on-premises infrastructures and services is blurring. Recent developments in the industry, such as the merger between Hadoop rivals Cloudera and Hortonworks, as well as IBM's acquisition of Red Hat, indicate that an exciting hybrid cloud future awaits us. Public clouds entering on-premises means that the same logically centralized control planes (and associated managed services) will be available both on public clouds and on-premises, making hybrid data planes possible. In addition, deployment targets for software systems, which until recently were limited to bare metal and heavyweight virtual machines, have proliferated. They now form a hierarchy: bare-metal physical machines, virtual machines, microVMs, containers, isolates, and functions. This fundamentally changes how distributed data platforms and data-intensive applications will be developed. This talk outlines the architectural building blocks that can be used in specific design patterns for developing modern distributed data platforms. We will draw from our experiences in developing a prominent distributed data analytics platform, Apache Hadoop, and outline how such a platform could be built today with modern building blocks. We intend to cover most aspects of the platform control planes, such as high availability, disaster recovery, security, resource scheduling, orchestration, resource isolation, allocation, management, monitoring, scaling, metrics, metering, and logging. I will illustrate this architectural paradigm shift with some of the design choices we have made at Ampool, a modern data analytics platform.
Applying The Universal Scalability Law to Distributed Systems
When I originally developed the Universal Scalability Law (USL), it was in the context of tightly-coupled Unix multiprocessors, which led to an inherent dependency between the serial contention term and the data consistency term in the USL, i.e., no contention, no coherency penalty. Later, I realized that the USL could have broader applicability to large-scale clusters if this dependency was removed. In this talk I will show examples of how the USL can be applied as a statistical regression model to a variety of large-scale distributed systems, such as Hadoop, ZooKeeper, Sirius, AWS cloud, and Avalanche DLT, in order to quantify their scalability numerically in terms of concurrency, contention, and coherency.
Dr. Neil J. Gunther
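As a rough illustration of what "applying the USL as a regression model" can look like, here is a minimal Python sketch. It is not taken from the talk. It assumes the commonly published form of the USL, C(N) = N / (1 + α(N-1) + βN(N-1)), where α is the contention coefficient and β the coherency coefficient, and it fits both by ordinary least squares on a linearized form. The function names and the sample data are hypothetical.

```python
import math


def usl_capacity(n, alpha, beta):
    """Relative capacity C(N) = N / (1 + alpha*(N-1) + beta*N*(N-1))."""
    return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1))


def fit_usl(loads, throughputs):
    """Estimate (alpha, beta) from measured throughput at each load N.

    Assumption: the data includes a measurement at N = 1, used to
    normalize throughput into relative capacity C(N) = X(N) / X(1).
    Rearranging the USL gives (N/C - 1) / (N - 1) = alpha + beta*N,
    a straight line in N with intercept alpha and slope beta.
    """
    x1 = dict(zip(loads, throughputs))[1]
    xs, ys = [], []
    for n, x in zip(loads, throughputs):
        if n == 1:
            continue  # the N = 1 point only serves as the normalizer
        c = x / x1
        xs.append(float(n))
        ys.append((n / c - 1.0) / (n - 1))
    # Ordinary least squares for a line y = alpha + beta * x.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


def peak_load(alpha, beta):
    """Load at which the USL curve peaks: N* = sqrt((1 - alpha) / beta)."""
    return math.sqrt((1.0 - alpha) / beta)


if __name__ == "__main__":
    # Synthetic, noise-free measurements generated from known coefficients
    # (hypothetical values, for illustration only).
    loads = [1, 2, 4, 8, 16, 32, 64]
    throughputs = [100.0 * usl_capacity(n, 0.05, 0.0002) for n in loads]
    alpha, beta = fit_usl(loads, throughputs)
    print("alpha:", alpha, "beta:", beta, "peak N:", peak_load(alpha, beta))
```

With real measurements the fit would be noisy, and a nonlinear least-squares fit on the untransformed data may be preferable; the linearized version is used here only because it needs nothing beyond the standard library.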
Combating Entropy in a Highly Distributed System
In this talk, we’ll go over the case study of Helpshift, a massively distributed architecture that sees 160,000 requests per second, and cover the problems that arise due to entropy. Topics will include what data inconsistencies look like in the real world, Helpshift’s distributed architecture, and our initial attempts and why they did not work. We’ll also talk about isolating the root cause of data inaccuracies, the solution, and mitigating the unsolvable side effects of distributed architectures.
Building Scalable Distributed Tracing and Monitoring Systems
In large-scale, customer-focused distributed systems with asynchronous programming, tracing and monitoring become hard. As a system scales to higher request counts, with multiple projects involved in a single pipeline, having proper logging, tracing, monitoring, and metrics set up for the product becomes imperative. In this talk I will compare distributed monitoring, observability, and performance packages like Kamon, Zipkin, and Prometheus, and cover best practices for building a scalable distributed tracing and monitoring system that is independent of the project's code style.
Knowledge Logic: Applications to Distributed Systems and Life in General
Knowledge Logic is a system of logic that provides a framework for what it means to "know" something, or how to deduce "I know that you know that I know". This has far-reaching implications for everything from distributed systems to politics to navigating traffic. This talk is a gentle introduction to the topic, and will leave you with the know-how to topple dictators.
Dr. Sriram Srinivasan
Hotel Novotel, Pune, Maharashtra