8th International Conference on Web Intelligence, Mining and Semantics
June 25 – 27 2018, Novi Sad, Serbia
This tutorial offers a rich blend of theory and practice regarding dimensionality reduction methods and graph mining algorithms, to deal with challenging issues such as scalability, data noise, and sparsity in recommender systems. Matrix and tensor decomposition methods have proven to be the most accurate (e.g., in the Netflix Prize) and efficient for handling big data. For each method (SVD, SVD++, HOSVD, CUR, etc.) we will provide a detailed theoretical and mathematical background and a step-by-step analysis, using an integrated toy example that runs through all parts of the tutorial and helps the audience clearly understand the differences among factorization methods. Moreover, this tutorial surveys important research on a new family of recommender systems aimed at serving multi-dimensional social networks. We will review related work on similarity search on graphs and present the random walk-based algorithms (e.g., PageRank, SimRank, Katz) that can be used to provide contextual recommendations in multi-dimensional graphs, where there are many participating entities (users, locations, products, and the time dimension).
Panagiotis Symeonidis is an assistant professor at the Faculty of Computer Science (scientific sector INF/01) of the Free University of Bozen-Bolzano. Before moving to Bolzano he worked for 8 years as an assistant professor at the Department of Informatics of the Aristotle University of Thessaloniki, Greece. He received a Bachelor (BA) in Applied Informatics from Macedonia University of Greece in 1996 and a Master diploma (MSc) in Information Systems from the same university in 2004. He received his PhD in Web Mining and Information Retrieval for Personalization from the Department of Informatics of the Aristotle University of Thessaloniki, Greece, in 2008. His research interests include web mining (usage mining, content mining and graph mining), information retrieval, collaborative filtering, recommender systems, social media in Web 2.0 and location-based social networks. He is the co-author of 3 international books, 1 Greek book, 4 book chapters, 18 journal publications and 29 conference/workshop publications. His articles have received more than 1800 citations from other scientific publications.
Time-series classification is the common denominator in various recognition tasks, such as signature verification, person identification based on keystroke dynamics, and detection of cardiovascular diseases and brain disorders (e.g. early-stage Alzheimer's disease or dementia). This tutorial aims to give an overview of the most prominent challenges (tasks), methods, evaluation protocols and biomedical applications related to time series classification. Besides the "conventional" time series classification task, early classification and semi-supervised classification will be considered. Both preprocessing techniques (Fourier transformation, SAX, etc.) and the most prominent classifiers (such as similarity-based, feature-based and motif/shapelet-based classifiers and convolutional neural networks) will be covered. The tutorial will point out that carefully designed evaluation protocols are required in order to assess the quality of the models fairly. Depending on the application scenario, this includes realistic assumptions about the availability of training data, careful (e.g. patient-based) train and test splits, etc. Selected applications will be explained, such as classification of functional magnetic resonance imaging (fMRI) data and person identification based on keystroke dynamics.
Krisztian Buza is currently a post-doc research assistant at the University of Bonn. He obtained his Diploma in Computer Science from the Budapest University of Technology and Economics in 2007 and his Ph.D. from the University of Hildesheim in 2011. He is a co-author of more than 40 publications, including the "best paper" of the IEEE Conference on Computational Science and Engineering (2010) for his work on individualized error prediction for time series classification. His research focuses on time series classification and biomedical applications of machine learning and data mining.
Methods to identify outliers in traffic data use different techniques and formulations, analyzing and translating the traffic data in different ways in order to apply statistical techniques, similarity-based techniques, or techniques based on frequent pattern mining. In this tutorial, we give a structured overview relating various approaches to some fundamental outlier detection models. These classic methods (such as the "Local Outlier Factor") are well understood and have a clear mathematical formulation. By relating complex methods (that are adapted and tailored to a specific application such as traffic data) to abstract and fundamental methods, we can better understand their intuition, limitations, and benefits. As a result, practitioners get some guidance for selecting the most suitable methods for their case.
Youcef Djenouri is a postdoc in the Department for Mathematics and Computer Science (IMADA) at University of Southern Denmark (SDU), in Odense, Denmark. Previously, he was granted a post-doctoral fellowship from the UNIST university in South Korea, and worked as an assistant professor at USDB university in Blida, Algeria. Youcef holds bachelor and master-level degrees in Computer Science, involving studies at USDB and USTHB universities in Algeria, where he was ranked fourth and first in his class, respectively. He finished his Ph.D. thesis in computer science on "Parallel Association Rule Mining" at USTHB in December 2014. During his Ph.D. he was granted a short-term research visitor internship at ENSMEA University in Poitiers, France. His research interests include data mining, machine learning, parallel computing and artificial intelligence as well as bio-inspired computing. He has published more than 30 papers at peer-reviewed international conferences and in international journals. He received the "Best Paper Award" at PAKDD's BDM Workshop 2017. Youcef has presented several papers at different data mining and parallel computing venues (PAKDD, WIC, PDP, PPAM, and DCAI). For more information please visit this link.
Arthur Zimek is Associate Professor in the Department for Mathematics and Computer Science (IMADA) at University of Southern Denmark (SDU), in Odense, Denmark. Previously he worked as a Privatdozent in the database systems and data mining group at Ludwig-Maximilians-University Munich (LMU), Germany, as a guest professor at Technical University Vienna, Austria, and as a postdoctoral fellow in the department for Computing Science at University of Alberta, Edmonton, Canada. Arthur holds master-level degrees in bioinformatics, philosophy, and theology, involving studies at universities in Germany (TUM, HfPh, LMU Munich, and JGU Mainz) as well as Austria (LFU Innsbruck). He finished his Ph.D. thesis in informatics on "Correlation Clustering" at LMU in summer 2008. For this work he received the "SIGKDD Doctoral Dissertation Award (runner-up)" in 2009. His research interests include ensemble techniques for unsupervised learning, clustering, outlier detection, and high dimensional data, developing data mining methods as well as evaluation methodology. He has published more than 70 papers at peer-reviewed international conferences and in international journals. Together with his co-authors, he received the "Best Paper Honorable Mention Award" at SDM 2008 and the "Best Demonstration Paper Award" at SSTD 2011. Arthur has presented several tutorials on different data mining topics at several conferences (SIGKDD, VLDB, PAKDD, ICDM, SDM, ECMLPKDD). For more information please visit this link.