
13 - Conclusion

from Part VI - Closing notes

Published online by Cambridge University Press:  05 March 2014

Henrique C. M. Andrade (J. P. Morgan)
Buğra Gedik (Bilkent University, Ankara)
Deepak S. Turaga (IBM Thomas J. Watson Research Center, New York)

Summary

Stream processing has emerged from the confluence of advances in data management, parallel and distributed computing, signal processing, statistics, data mining, and optimization theory.

Stream processing is an intuitive computing paradigm in which data is consumed as it is generated, computation is performed at wire speed, and results are produced immediately, all within a continuous cycle. This paradigm arose from the need to support a new class of applications. These analytics-centric applications focus on extracting intelligence from large quantities of continuously generated data to provide faster, online, real-time results. They span multiple domains, including environment and infrastructure monitoring, manufacturing, finance, healthcare, telecommunications, physical and cyber security, and large-scale scientific and experimental research.

In this book, we have discussed the emergence of stream processing and the three pillars that sustain it: the programming paradigm, the software infrastructure, and the analytics. Together, these enable the development of large-scale, high-performance stream processing applications (SPAs).

In this chapter, we start with a quick recap of the book (Section 13.1), then look at the existing challenges and open problems in stream processing (Section 13.2), and end with a discussion of how this technology may evolve in the coming years (Section 13.3).

Book summary

In the two introductory chapters of the book (Chapters 1 and 2), we traced the origins of stream processing, provided an overview of its technical fundamentals, and described the technological landscape in the area of continuous data processing.

Type: Chapter
Information: Fundamentals of Stream Processing: Application Design, Systems, and Analytics, pp. 487–499
Publisher: Cambridge University Press
Print publication year: 2014

