Large Volumes of Data Are Not Just for Large Corporations: Handling Mass Data in the Mittelstand
Wednesday, 27 February 2013, 14:00-18:00
Museum für Kommunikation Nürnberg, conference hall, 2nd floor
PARSTREAM OVERVIEW
CHRISTIAN WERLING, DIRECTOR SALES D/A/CH
CHRISTIAN.WERLING@PARSTREAM.COM
Big Data is a Hype!
"The invention has so many shortcomings that it cannot seriously serve as a means of communication. The device is inherently of no value to us."
Western Union Financial Services memo on the invention of the telephone, 1876

"Worldwide demand for motor vehicles will not exceed one million, if only for lack of available chauffeurs."
Gottlieb Daimler, inventor, 1901

"The horse is here to stay; automobiles, by contrast, are merely a passing fad."
The president of the Michigan Savings Bank, 1903
"I think there is a world market for maybe five computers."
Thomas Watson, CEO of IBM, 1943

"Television will not catch on in the market. People will very soon tire of staring at a plywood box every evening."
Darryl F. Zanuck, head of the film studio 20th Century-Fox, 1946

"BIG DATA is pure hype. Nobody needs it."
Max Mustermann, data expert, Nürnberg, 27 February 2013
What Big Data is NOT!
Big Data is NOT the storage of large datasets!
Big Data is NOT BI + 20%!
What is ParStream?
ParStream is THE Big Data Analytics Platform
ParStream enables enterprises to exploit Big Data opportunities and beat the competition through speed of implementation and operation.
REAL-TIME, LOW-LATENCY, HIGH THROUGHPUT
PARSTREAM BIG DATA ANALYTICS PLATFORM
2008 13,000,000,000 * x100 * 500 = Challenge
PARSTREAM BIG DATA ANALYTICS PLATFORM
ParStream Empowers People in All Industries to Capture New Business Opportunities Evolving with Big Data
Analyze and Filter Billions of Records
Query Data Structures with 1000s of Columns
Get Answers in Milliseconds without Cubes
Continuously Import Data with Low Latency
Execute 1000s of Concurrent Queries
ROADBLOCKS
Established Database Vendors Can't Deliver Technical Solutions
MapReduce Can't Deliver Results in Real-Time
Established Database Architectures Were Not Designed for Big Data
NoSQL Approaches Cannot Deliver in Real-Time
Extreme Performance Can Only Be Achieved through Parallelization
Supporting Both Volume and Speed Has Been Unachievable
[Figure: lag time (real-time: under 1..10 ms, 10..100 ms, 1 sec; interactive analytics: 1..10 sec; batch analytics (MapReduce): 1..10 min, over 10 min) plotted against data volume (gigabyte, terabyte, petabyte), from operational data volumes to Big Data volume. Labeled technologies: Complex Event Processing, In-Memory DB, OLTP, Reporting.]
HUGE MARKET OPPORTUNITY
Big Data Analytics is a game changer in every industry and a huge market opportunity.
All Industries: eCommerce, Services, Social Networks, Telco, Finance, Energy, Oil and Gas, Many More
Many Applications: faceted search, web analytics, SEO analytics, online advertising, ad serving, profiling, targeting, customer attrition prevention, network monitoring, prepaid account management, trend analysis, fraud detection, automatic trading, risk analysis, smart metering, smart grids, wind parks, solar panels, mining, production, M2M, sensors, genetics, intelligence, weather
ENABLING KEY BUSINESS SCENARIOS
Initial Focus on Scenarios Requiring a Unique Combination of REAL-TIME, LOW LATENCY, HIGH THROUGHPUT
Search and Selection: ParStream enables new levels of search and online shopping satisfaction
Real-Time Analytics: ParStream drives interactive analytics processes to gain insights faster
Online-Processing: ParStream responds automatically to large data streams
KEY SCENARIO: SEARCH AND SELECTION
ParStream Enables New Levels of Search & Online Shopping Satisfaction
Coface Services Changed from Oracle to ParStream
Customer Success Story:
Coface Services stopped development with Oracle after 6 years with only a partial solution
ParStream built the intended solution within 4 months, running on a single small server
Coface Services: "Very impressive results. We did not believe that ParStream would be able to deliver such a great solution."
Value Proposition:
Build great product search sites with stickiness and better conversion
Fast ROI through increased sales revenue
Target Market:
Large online shops
Information marketplaces
Social communities
Examples: travel search platform, information marketplace
CUSTOMER SUCCESS STORY: ETRACKER
ParStream provides real-time campaign control and web analytics 1,000 to 12,000 times faster than MySQL-cluster
Excellent customer success stories
Campaign Control & Web Analytics:
Continuous data import of new web clicks every few seconds
10 billion web clicks from 100 days
Continuous data import with a maximum latency of 30 seconds
Complex analytics for live segmentation of customer groups
< 2 sec query response time for > 100 concurrent interactive users
20-server cluster, shared nothing
Data flow: website clicks from 50,000 domains; 100 million rows continuous import per day; ParStream with 100,000,000,000 rows and < 2 sec response time; application server issuing large multi-stage aggregation SQL queries from many concurrent users
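As a rough plausibility check of the import figures on this slide, the sketch below works out the average sustained import rate and the number of clicks retained over the 100-day window. The function names are illustrative only and are not part of any ParStream API; the inputs are the slide's own numbers.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def rows_per_second(rows_per_day: int) -> float:
    """Average sustained import rate implied by a daily row count."""
    return rows_per_day / SECONDS_PER_DAY


def retained_rows(rows_per_day: int, days: int) -> int:
    """Rows held online when `days` days of imported data are kept."""
    return rows_per_day * days


if __name__ == "__main__":
    daily = 100_000_000  # slide: 100 million rows imported per day
    window = 100         # slide: web clicks from 100 days
    print(f"{rows_per_second(daily):.0f} rows/s on average")
    print(f"{retained_rows(daily, window):,} rows retained")
```

At the stated rate, 100 days of import gives the 10 billion web clicks quoted on the slide. The larger total row count shown for the ParStream cluster is not derived by this sketch.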
KEY SCENARIO: REAL-TIME ANALYTICS
ParStream Drives Interactive Analytics Processes to Gain Insights Faster
Customers Replace Existing Solutions to Profit from ParStream's Speed
Customer Success Story:
Etracker discarded MySQL-cluster because ParStream is up to 12,000 times faster
Searchmetrics chose ParStream because its efficiency enabled an international roll-out
Rio Tinto changed to ParStream because its speed sustains a competitive advantage
Value Proposition:
Driving an interactive analytics process delivers insights more quickly and more accurately
High ROI through process innovation and greatly reduced infrastructure cost
Target Market:
Ad spending and web analytics
Profiling and targeting
Advanced analytics
Examples: web analytics, SEO analytics, geo-spatial analytics
ARCHITECTURE BUILDING BLOCKS
ParStream is a Big Data Analytics Platform Based on a Unique High Performance Compressed Index
Hybrid Columnar/Row Storage
In-Memory Technology
Shared Nothing Architecture
Standard Interfaces
Unique High Performance Compressed Index
Building blocks:
SQL API / JDBC / ODBC
C++ UDF API
Real-Time Analytics Engine
In-Memory and Disk Technology
Massively Parallel Processing (MPP)
High Performance Compressed Index (HPCI)
Fast Hybrid Storage (Columnar/Row)
Multi-Dimensional Partitioning
Shared Nothing Architecture
High Speed Loader with Low Latency
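To make the shared-nothing and MPP building blocks more concrete, here is a minimal sketch of a scatter/gather aggregation: rows are partitioned by a key, each "node" aggregates only its own partition, and the partial results are merged. This is a generic illustration, not ParStream's implementation. The function names, the choice of the domain as partition key, and the use of threads as stand-ins for separate servers are all assumptions.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_nodes):
    """Assign each (domain, clicks) row to one node by hashing the domain."""
    nodes = [[] for _ in range(n_nodes)]
    for domain, clicks in rows:
        node_id = zlib.crc32(domain.encode("utf-8")) % n_nodes
        nodes[node_id].append((domain, clicks))
    return nodes


def local_aggregate(node_rows):
    """Sum clicks per domain using only the rows stored on this node."""
    totals = Counter()
    for domain, clicks in node_rows:
        totals[domain] += clicks
    return totals


def query_total_clicks(rows, n_nodes=4):
    """Scatter the aggregation to all nodes, then merge the partial sums."""
    nodes = partition(rows, n_nodes)
    # Threads only stand in for independent servers in this sketch.
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(local_aggregate, nodes))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts per key
    return dict(merged)


if __name__ == "__main__":
    clicks = [("a.example", 3), ("b.example", 5), ("a.example", 2)]
    print(query_total_clicks(clicks))
```

Because no node reads another node's data, adding nodes spreads both storage and query work, which is the property the shared-nothing label refers to.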
HIGH PERFORMANCE COMPRESSED INDEX
The Key to ParStream's Unmatched Performance
STANDARD DB INDEX ARCHITECTURE:
High Memory Requirements
High Load on CPUs
Time for Decompression
Not Suitable for Big Data Analytics
PARSTREAM INDEX ARCHITECTURE:
+ Low Memory Requirements
+ No Need for Decompression
+ Patent filing in process
Engineered for Big Data Analytics
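The slide does not describe how the HPCI works internally, so the following is only a generic sketch of the idea of evaluating a filter on a compressed index without decompressing it. It uses plain run-length encoded bitmaps, a well-known technique chosen here for illustration; it is not ParStream's patented index, and all names are hypothetical.

```python
def rle_encode(bits):
    """Compress a sequence of 0/1 values into [bit, run_length] runs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1
        else:
            runs.append([b, 1])
    return runs


def rle_and(a, b):
    """AND two equal-length run-length encoded bitmaps run by run.

    The bitmaps are never expanded back into individual bits.
    """
    out = []
    i = j = 0
    rem_a = a[0][1] if a else 0
    rem_b = b[0][1] if b else 0
    while i < len(a) and j < len(b):
        step = min(rem_a, rem_b)      # overlap of the two current runs
        bit = a[i][0] & b[j][0]
        if out and out[-1][0] == bit:
            out[-1][1] += step        # extend the previous output run
        else:
            out.append([bit, step])
        rem_a -= step
        rem_b -= step
        if rem_a == 0:
            i += 1
            rem_a = a[i][1] if i < len(a) else 0
        if rem_b == 0:
            j += 1
            rem_b = b[j][1] if j < len(b) else 0
    return out


def count_ones(runs):
    """Number of matching rows, computed directly from the runs."""
    return sum(length for bit, length in runs if bit)


if __name__ == "__main__":
    # One bitmap per predicate: bit k is 1 if row k satisfies it.
    is_germany = rle_encode([1, 1, 0, 0, 1, 1, 1, 0])
    is_mobile = rle_encode([1, 0, 0, 1, 1, 1, 0, 0])
    both = rle_and(is_germany, is_mobile)
    print(both, count_ones(both))
```

In this sketch the cost of combining two predicates depends on the number of runs rather than the number of rows, which illustrates why skipping decompression can save both memory and CPU time.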
PARSTREAM: ORDERS OF MAGNITUDE FASTER
ParStream Outperforms PostgreSQL by a Factor of 1000, Delivering Results in Sub-Seconds on Large Data Volumes
[Figure: query time in seconds versus number of rows (0 to 180 million) for Query 1, Query 2 and Query 3. PostgreSQL (scale 100 sec): slow, exponential growth. ParStream (scale 1 sec): fast, linear growth, 1000 times faster.]
ONE LICENSE, FOUR WAYS TO DELIVER
Customer Choice: Software, Cloud, Partner (OEM, ISV, SI), Appliance
From software that can be configured and run on customer-specific infrastructure to cloud and appliance
ParStream @ Mittelstand?
PARSTREAM @ MITTELSTAND
Runs on existing standard hardware or in the cloud
Prototype in 3-5 days at most
Low training effort
Hello World in 2 minutes
Pricing scales solely with data volume
Data is the New Oil
ParStream Provides the Platform
Christian Werling, Director Sales
ParStream GmbH, Große Sandkaul 2, 50667 Köln
Christian.Werling@parstream.com