Conclusion

Henrique C. M. Andrade; Buğra Gedik; Deepak S. Turaga

doi:10.1017/CBO9781139058940.015

13 - Conclusion

from Part VI - Closing notes

Published online by Cambridge University Press: 05 March 2014


Affiliations:
Henrique C. M. Andrade: J. P. Morgan
Buğra Gedik: Bilkent University, Ankara
Deepak S. Turaga: IBM Thomas J. Watson Research Center, New York


Summary

Stream processing has emerged from the confluence of advances in data management, parallel and distributed computing, signal processing, statistics, data mining, and optimization theory.

Stream processing is an intuitive computing paradigm where data is consumed as it is generated, computation is performed at wire speed, and results are immediately produced, all within a continuous cycle. This paradigm arose from the need to support a new class of applications. These analytic-centric applications focus on extracting intelligence from large quantities of continuously generated data to provide faster, online, and real-time results. They span multiple domains, including environment and infrastructure monitoring, manufacturing, finance, healthcare, telecommunications, physical and cyber security, and large-scale scientific and experimental research.

In this book, we have discussed the emergence of stream processing and the three pillars that sustain it: the programming paradigm, the software infrastructure, and the analytics, which together enable the development of large-scale, high-performance stream processing applications (SPAs).

In this chapter, we start with a quick recap of the book (Section 13.1), then look at the existing challenges and open problems in stream processing (Section 13.2), and end with a discussion of how this technology may evolve in the coming years (Section 13.3).

Book summary

In the two introductory chapters of the book (Chapters 1 and 2), we traced the origins of stream processing, provided an overview of its technical fundamentals, and described the technological landscape in the area of continuous data processing.

Type: Chapter
Information: Fundamentals of Stream Processing: Application Design, Systems, and Analytics, pp. 487-499

DOI: https://doi.org/10.1017/CBO9781139058940.015

Publisher: Cambridge University Press

Print publication year: 2014


