Keynotes



Geoffrey Charles Fox: Designing and Building an Analytics Library with the Convergence of High Performance Computing and Big Data

Abstract :
    Two major trends in computing systems are the growth of high performance computing (HPC), with an international exascale initiative, and the big data phenomenon, with an accompanying cloud infrastructure of well-publicized and dramatically increasing size and sophistication.
    We describe a classification of applications that considers "data" and "model" separately and allows one to form a unified picture of large-scale data analytics and large-scale simulations. We introduce the High Performance Computing enhanced Apache Big Data Software Stack (HPC-ABDS) and give several examples of advantageously linking HPC and ABDS. In particular, we discuss the Scalable Parallel Interoperable Data Analytics Library (SPIDAL) that is being developed to embody these ideas. SPIDAL covers core kernels in machine learning, image processing, graph analytics, simulation data analysis and network science. We use this to discuss the convergence of Big Data, Big Simulations, HPC and clouds.
    We give examples of data analytics running on HPC systems, including details on persuading Java to run fast.

Bio :
    Fox received a Ph.D. in Theoretical Physics from Cambridge University and is now Distinguished Professor of Informatics and Computing, and Physics, at Indiana University, where he is director of the Digital Science Center and chair of the Department of Intelligent Systems Engineering in the School of Informatics and Computing. He previously held positions at Caltech, Syracuse University and Florida State University, after postdoctoral appointments at the Institute for Advanced Study in Princeton, Lawrence Berkeley Laboratory and Peterhouse, Cambridge. He has supervised the Ph.D. theses of 67 students and published around 1200 papers in physics and computer science, with an h-index of 72 and over 28,000 citations.
    He currently works on applying computer science, from infrastructure to analytics, in Biology, Pathology, Sensor Clouds, Earthquake and Ice-sheet Science, Image Processing, Deep Learning, Network Science, Financial Systems and Particle Physics. The infrastructure work is built around Software Defined Systems on Clouds and Clusters; the analytics focuses on scalable parallelism. He is involved in several projects to enhance the capabilities of Minority Serving Institutions, and he has experience in online education and its use in MOOCs for areas like Data and Computational Science. He is a Fellow of the APS (Physics) and the ACM (Computing).


Domenico Talia: Data Analytics in Clouds: Issues, Tools and Challenges

Abstract :
    Digital data is growing beyond all previous estimates, and data stores and sources are ever more pervasive and distributed. Professionals and scientists need advanced data analysis tools and services, coupled with scalable architectures, to support the extraction of useful information from big data repositories. Cloud computing systems offer effective support for addressing both the computational and data storage needs of big data mining and parallel knowledge discovery applications. In fact, complex data mining tasks involve data- and compute-intensive algorithms that require large and efficient storage facilities together with high performance processors to get results in acceptable times. In this talk we introduce the topic and the main research issues. We discuss how to make knowledge discovery services scalable and present the Data Mining Cloud Framework, designed for developing and executing distributed data analytics applications as workflows of services. In this environment, data sets, analysis tools, data mining algorithms and knowledge models are implemented as single services that can be combined through a visual programming interface into distributed workflows executed on Clouds. The main features of the programming interface are described, and the performance of knowledge discovery applications is discussed.

Bio :
    Domenico Talia is a full professor of Computer Engineering at DIMES, University of Calabria, Italy, and chair of the ICT Center of Università della Calabria. He is also a partner in two startups, Exeura and Scalable Data Analytics. His research interests include cloud computing, grid computing, parallel and distributed data mining algorithms, peer-to-peer systems, parallel programming, mobile computing and distributed systems. He has published seven books and more than 300 papers in archival journals and conference proceedings. He is a member of the editorial boards of several journals, including IEEE Transactions on Computers, and has served as program chair or program committee member of numerous conferences.


Marian Bubak: Federating Cloud Computing Resources for Scientific Computing

Abstract :
    To be announced.

Bio :
    Marian Bubak holds an M.Sc. in Technical Physics and a Ph.D. in Computer Science. He is an adjunct at the Institute of Computer Science and ACC Cyfronet, AGH University of Science and Technology, Kraków, Poland, and a Professor of Distributed System Engineering at the Universiteit van Amsterdam.
    His research interests include parallel and distributed computing, grid systems, and e-science; he is the author of about 200 papers in this area, co-editor of 28 proceedings of international conferences, and an Associate Editor of FGCS (Future Generation Computer Systems). He has served as leader of the Architecture Team in the CrossGrid project, Scientific Coordinator of K-WfGrid, member of the Integration Monitoring Committee of CoreGRID, WP leader in the ViroLab and GREDIA EU IST projects, and Strategic Planning Team leader in the PL-Grid project.


Zhiwei Xu: Elastic Processors for Deep Learning

Abstract :
    A central challenge for the computer architecture community is energy efficiency, i.e., increasing by orders of magnitude the number of operations executed per Joule, or Giga operations per second per Watt (GOPS/W). The worldwide scientific computing community has set a research goal of achieving Exaflops at 20 MW, or 50 GOPS/W, by 2022. In this talk, we raise a research goal for intelligent computing: achieving 1000 GOPS/W on deep learning workloads by 2022. This calls for a three-orders-of-magnitude improvement over the current energy efficiency of CPUs. We present promising research results on elastic processors, a new computer architecture featuring the Function Instruction Set Computer (FISC), an advance over traditional CISC and RISC processor architectures. We show both emulation and ASIC chip results that are orders of magnitude better than CPUs on three types of representative workloads in neural network computing.
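    As a quick sanity check of the figures quoted above (a reader's verification, not part of the abstract), the 50 GOPS/W target follows directly from the exascale goal:

    \[
    \frac{1\ \text{Exaflops}}{20\ \text{MW}}
    = \frac{10^{18}\ \text{ops/s}}{2 \times 10^{7}\ \text{W}}
    = 5 \times 10^{10}\ \text{ops/s per W}
    = 50\ \text{GOPS/W},
    \]

    and the 1000 GOPS/W goal, measured against the stated three-orders-of-magnitude gap, implies a baseline of roughly 1 GOPS/W for current CPUs.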

Bio :
    Zhiwei Xu is a professor and the CTO of the Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS). His prior industrial experience includes serving as chief engineer of Dawning Corp. (now Sugon, listed on the Shanghai Stock Exchange), a leading high-performance computer vendor in China. He currently leads “Cloud-Sea Computing Systems”, a strategic priority research project of the Chinese Academy of Sciences that aims to develop billion-thread computers with elastic processors. He holds a Ph.D. from the University of Southern California, an M.S. from Purdue University, and a B.S. from the University of Electronic Science and Technology of China.


Hai Zhuge: Multi-Dimensional Summarization in Cyber-Physical Society

Abstract :
    Summarization is one of the key features of human intelligence, and it plays an important role in understanding and representation. With the rapid and continual expansion of texts, pictures and videos in cyberspace, automatic summarization becomes more and more desirable. Text summarization has been studied for over half a century, but it is still hard to automatically generate a satisfactory summary. Traditional methods process texts empirically and neglect the fundamental characteristics and principles of language use and understanding. This lecture summarizes previous text summarization approaches in a multi-dimensional classification space, introduces a multi-dimensional methodology for research and development, unveils the basic characteristics and principles of language use and understanding, investigates some fundamental mechanisms of summarization, studies the dimensions and forms of representations, and proposes a multi-dimensional evaluation mechanism. The investigation extends to the incorporation of pictures into summaries and to the summarization of videos, graphs and pictures, and then arrives at a general summarization framework.

Bio :
    Prof. Zhuge has made systematic contributions to semantics modelling, knowledge modelling and the practice of the cyber-infrastructure for knowledge sharing and management, through lasting fundamental innovation on knowledge, semantics, dimension and self-organization. He is extending his research to a long-term plan for the cyber-physical society, which involves multi-disciplinary methodological, theoretical and technical innovations with significant practical impact.

