Technical University of Munich

  • Data Analytics and Machine Learning Group
  • TUM School of Computation, Information and Technology

Open Topics

We offer multiple Bachelor's/Master's theses, Guided Research projects and IDPs in the area of data mining and machine learning. A non-exhaustive list of open topics is given below.

If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential topics.

Robustness of Large Language Models

Type: Master's Thesis

Prerequisites:

  • Strong knowledge in machine learning
  • Very good coding skills
  • Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
  • Knowledge about NLP and LLMs

Description:

The success of Large Language Models (LLMs) has precipitated their deployment across a diverse range of applications. With the integration of plugins enhancing their capabilities, it becomes imperative to ensure that the governing rules of these LLMs are foolproof and immune to circumvention. Recent studies have exposed significant vulnerabilities inherent to these models, underlining an urgent need for more rigorous research to fortify their resilience and reliability. A focus in this work will be the understanding of the working mechanisms of these attacks.

We are currently seeking students for the Summer Semester 2024, so we welcome prompt applications. This project is in collaboration with Google Research.

Contact: Tom Wollschläger

References:

  • Universal and Transferable Adversarial Attacks on Aligned Language Models
  • Attacking Large Language Models with Projected Gradient Descent
  • Representation Engineering: A Top-Down Approach to AI Transparency
  • Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

Generative Models for Drug Discovery

Type: Master's Thesis / Guided Research

  • Strong machine learning knowledge
  • Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
  • Knowledge of graph neural networks (e.g. GCN, MPNN)
  • No formal education in chemistry, physics or biology needed!

Effectively designing molecular geometries is essential to advancing pharmaceutical innovation, a domain that has attracted great attention through the success of generative models. By leveraging their learned representations, these models promise a more efficient exploration of the vast chemical space and the generation of novel compounds with specific properties, potentially leading to the discovery of molecules that would otherwise go undiscovered. Our topics lie at the intersection of generative models, such as diffusion and flow matching models, and graph representation learning, e.g., graph neural networks. Projects can focus on model development with an emphasis on downstream tasks (e.g., diffusion guidance at inference time) and on a better understanding of the limitations of existing models.

Contact: Johanna Sommer, Leon Hetzel

  • Equivariant Diffusion for Molecule Generation in 3D
  • Equivariant Flow Matching with Hybrid Probability Transport for 3D Molecule Generation
  • Structure-based Drug Design with Equivariant Diffusion Models

Efficient Machine Learning: Pruning, Quantization, Distillation, and More - DAML x Pruna AI

Type: Master's Thesis / Guided Research / Hiwi

The efficiency of machine learning algorithms is commonly evaluated by looking at target performance, speed and memory footprint metrics. Reducing the costs associated with these metrics is of primary importance for real-world applications with limited resources (e.g. embedded systems, real-time predictions). In this project, you will work in collaboration with the DAML research group and the Pruna AI startup on investigating solutions to improve the efficiency of machine learning models by looking at multiple techniques like pruning, quantization, distillation, and more.
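As a rough illustration of two of the techniques named above, the following minimal sketch applies magnitude pruning and symmetric uniform quantization to a flat list of weights. The function names `magnitude_prune` and `quantize_uniform` are hypothetical, and this is a toy under stated assumptions, not the method used by the group or by Pruna AI.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def quantize_uniform(weights, num_bits=8):
    """Round each weight to a symmetric uniform grid, then map it back to floats."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (num_bits - 1) - 1   # e.g. 127 for 8 bits
    scale = max_abs / levels           # width of one quantization step
    return [round(w / scale) * scale for w in weights]


weights = [0.5, -0.1, 0.03, -2.0]
print(magnitude_prune(weights, 0.5))   # the two smallest-magnitude weights become 0.0
print(quantize_uniform(weights, 8))
```

Both functions trade accuracy for cost: pruning reduces the number of non-zero parameters, while quantization reduces the number of bits needed per parameter.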

Contact: Bertrand Charpentier

  • The Efficiency Misnomer
  • A Gradient Flow Framework for Analyzing Network Pruning
  • Distilling the Knowledge in a Neural Network
  • A Survey of Quantization Methods for Efficient Neural Network Inference

Deep Generative Models

Type: Master's Thesis / Guided Research

  • Strong machine learning and probability theory knowledge
  • Knowledge of generative models and their basics (e.g., Normalizing Flows, Diffusion Models, VAE)
  • Optional: Neural ODEs/SDEs, Optimal Transport, Measure Theory

With recent advances, such as Diffusion Models, Transformers, Normalizing Flows, Flow Matching, etc., the field of generative models has gained significant attention in the machine learning and artificial intelligence research community. However, many problems and questions remain open, and the application to complex data domains such as graphs, time series, point processes, and sets is often non-trivial. We are interested in supervising motivated students to explore and extend the capabilities of state-of-the-art generative models for various data domains.

Contact: Marcel Kollovieh, David Lüdke

  • Flow Matching for Generative Modeling
  • Auto-Encoding Variational Bayes
  • Denoising Diffusion Probabilistic Models 
  • Structured Denoising Diffusion Models in Discrete State-Spaces

Graph Structure Learning

Type:  Guided Research / Hiwi

  • Optional: Knowledge of graph theory and mathematical optimization

Graph deep learning is a powerful ML concept that enables the generalisation of successful deep neural architectures to non-Euclidean structured data. Such methods have shown promising results in a vast range of applications spanning the social sciences, biomedicine, particle physics, computer vision, graphics and chemistry. One of the major limitations of most current graph neural network architectures is that they often rely on the assumption that the underlying graph is known and fixed. However, this assumption is not always true, as the graph may be noisy or partially and even completely unknown. In the case of noisy or partially available graphs, it would be useful to jointly learn an optimised graph structure and the corresponding graph representations for the downstream task. On the other hand, when the graph is completely absent, it would be useful to infer it directly from the data. This is particularly interesting in inductive settings where some of the nodes were not present at training time. Furthermore, learning a graph can become an end in itself, as the inferred structure can provide complementary insights with respect to the downstream task. In this project, we aim to investigate solutions and devise new methods to construct an optimal graph structure based on the available (unstructured) data.

Contact: Filippo Guerranti

  • A Survey on Graph Structure Learning: Progress and Opportunities
  • Differentiable Graph Module (DGM) for Graph Convolutional Networks
  • Learning Discrete Structures for Graph Neural Networks
  • NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification

A Machine Learning Perspective on Corner Cases in Autonomous Driving Perception  

Type: Master's Thesis 

Industrial partner: BMW 

Prerequisites: 

  • Strong knowledge in machine learning 
  • Knowledge of Semantic Segmentation  
  • Good programming skills 
  • Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch) 

Description: 

In autonomous driving, state-of-the-art deep neural networks are used for perception tasks such as semantic segmentation. While the environment in datasets is controlled, novel classes or unknown disturbances can occur in real-world applications. To provide safe autonomous driving, these cases must be identified.

The objective is to explore novel class segmentation and out-of-distribution approaches for semantic segmentation in the context of corner cases for autonomous driving.

Contact: Sebastian Schmidt

References: 

  • Segmenting Known Objects and Unseen Unknowns without Prior Knowledge 
  • Efficient Uncertainty Estimation for Semantic Segmentation in Videos  
  • Natural Posterior Network: Deep Bayesian Uncertainty for Exponential Family  
  • Description of Corner Cases in Automated Driving: Goals and Challenges 

Active Learning for Multi Agent 3D Object Detection 

Type: Master's Thesis

Industrial partner: BMW

  • Knowledge in Object Detection 
  • Excellent programming skills 

In autonomous driving, state-of-the-art deep neural networks are used for perception tasks such as 3D object detection. To provide promising results, these networks often require large amounts of complex annotated data for training. These annotations are often costly and redundant. Active learning is used to select the most informative samples for annotation and to cover a dataset with as little annotated data as possible.

The objective is to explore active learning approaches for 3D object detection using combined uncertainty- and diversity-based methods.
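One simple way to combine uncertainty and diversity can be sketched as a greedy selection rule: each step picks the unlabeled sample that maximizes its uncertainty plus a weighted distance to the nearest sample already selected. The function `select_batch`, the `trade_off` weight and the Euclidean feature distance are assumptions made for illustration, not the approach of the referenced papers.

```python
import math


def select_batch(features, uncertainties, batch_size, trade_off=1.0):
    """Greedily pick samples with high uncertainty that are far from earlier picks."""
    selected = []
    candidates = set(range(len(features)))
    while candidates and len(selected) < batch_size:

        def score(i):
            if not selected:
                # First pick: uncertainty only, nothing to be diverse from yet.
                return uncertainties[i]
            # Diversity term: distance to the closest already-selected sample.
            diversity = min(math.dist(features[i], features[j]) for j in selected)
            return uncertainties[i] + trade_off * diversity

        # Sorting makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
uncertainties = [0.9, 0.8, 0.3]
print(select_batch(features, uncertainties, batch_size=2))                 # diversity-aware
print(select_batch(features, uncertainties, batch_size=2, trade_off=0.0))  # uncertainty only
```

With `trade_off=0.0` the rule degenerates to pure uncertainty sampling, which tends to select near-duplicate samples; the diversity term counteracts this redundancy.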

  • Exploring Diversity-based Active Learning for 3D Object Detection in Autonomous Driving   
  • Efficient Uncertainty Estimation for Semantic Segmentation in Videos   
  • KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection
  • Towards Open World Active Learning for 3D Object Detection   

Graph Neural Networks

Type:  Master's thesis / Bachelor's thesis / guided research

  • Knowledge of graph/network theory

Graph neural networks (GNNs) have recently achieved great successes in a wide variety of applications, such as chemistry, reinforcement learning, knowledge graphs, traffic networks, or computer vision. These models leverage graph data by updating node representations based on messages passed between nodes connected by edges, or by transforming node representations using spectral graph properties. These approaches are very effective, but many theoretical aspects of these models remain unclear, and there are many possible extensions to improve GNNs and go beyond the nodes' direct neighbors and simple message aggregation.
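The message-passing update described above can be illustrated with a minimal sketch in which every node averages its own features with those of its direct neighbors. The function `message_passing_step` is hypothetical, and it deliberately omits the learned weight matrices and nonlinearities that actual GNN layers use.

```python
def message_passing_step(features, edges):
    """One round of mean aggregation on an undirected graph.

    features: dict mapping node -> list of floats
    edges: iterable of (u, v) pairs
    """
    # Every node also receives its own message (self-loop).
    neighbors = {node: {node} for node in features}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    updated = {}
    for node, nbrs in neighbors.items():
        dim = len(features[node])
        # New representation: per-dimension mean over the node and its neighbors.
        updated[node] = [
            sum(features[n][d] for n in nbrs) / len(nbrs) for d in range(dim)
        ]
    return updated


features = {"a": [1.0], "b": [3.0], "c": [5.0]}
edges = [("a", "b"), ("b", "c")]
one_hop = message_passing_step(features, edges)
two_hop = message_passing_step(one_hop, edges)  # information from "c" now reaches "a"
print(one_hop)
print(two_hop)
```

Stacking several such steps is the simplest way to propagate information beyond direct neighbors, which is one of the limitations the topic description mentions.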

Contact: Simon Geisler

  • Semi-supervised classification with graph convolutional networks
  • Relational inductive biases, deep learning, and graph networks
  • Diffusion Improves Graph Learning
  • Weisfeiler and leman go neural: Higher-order graph neural networks
  • Reliable Graph Neural Networks via Robust Aggregation

Physics-aware Graph Neural Networks

Type:  Master's thesis / guided research

  • Proficiency with Python and deep learning frameworks (JAX or PyTorch)
  • Knowledge of graph neural networks (e.g. GCN, MPNN, SchNet)
  • Optional: Knowledge of machine learning on molecules and quantum chemistry

Deep learning models, especially graph neural networks (GNNs), have recently achieved great successes in predicting quantum mechanical properties of molecules. There is a vast amount of applications for these models, such as finding the best method of chemical synthesis or selecting candidates for drugs, construction materials, batteries, or solar cells. However, GNNs have only been proposed in recent years and there remain many open questions about how to best represent and leverage quantum mechanical properties and methods.

Contact: Nicholas Gao

  • Directional Message Passing for Molecular Graphs
  • Neural message passing for quantum chemistry
  • Learning to Simulate Complex Physics with Graph Networks
  • Ab initio solution of the many-electron Schrödinger equation with deep neural networks
  • Ab-Initio Potential Energy Surfaces by Pairing GNNs with Neural Wave Functions
  • Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds

Robustness Verification for Deep Classifiers

Type: Master's thesis / Guided research

  • Strong machine learning knowledge (at least equivalent to IN2064 plus an advanced course on deep learning)
  • Strong background in mathematical optimization (preferably in a machine learning setting)
  • Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
  • (Preferred) Knowledge of training techniques to obtain classifiers that are robust against small perturbations in data

Description: Recent work shows that deep classifiers are vulnerable to adversarial examples: misclassified points that are very close to the training samples or even visually indistinguishable from them. This undesired behaviour constrains the deployment of promising neural-network-based classification methods in safety-critical scenarios. Therefore, new training methods should be proposed that promote (or preferably ensure) robust behaviour of the classifier around training samples.
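To make the notion of an adversarial example concrete, here is a minimal sketch of a fast-gradient-sign perturbation, the attack introduced in "Explaining and harnessing adversarial examples", applied to a two-feature logistic regression classifier. The weights, the input and the helper names `predict_proba` and `fgsm_perturb` are made up for illustration; real attacks on deep networks compute the gradient by backpropagation.

```python
import math


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def fgsm_perturb(w, b, x, y, eps):
    """Move x by eps along the sign of the loss gradient with respect to x.

    For cross-entropy loss and logistic regression, d(loss)/dx = (p - y) * w.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [xi + eps * s for xi, s in zip(x, sign)]


w, b = [2.0, -1.0], 0.0
x, y = [0.3, 0.2], 1                       # correctly classified as class 1
x_adv = fgsm_perturb(w, b, x, y, eps=0.3)  # each coordinate changes by at most eps

print(predict_proba(w, b, x) > 0.5)        # clean input: predicted class 1
print(predict_proba(w, b, x_adv) > 0.5)    # perturbed input: prediction flips
```

Robust training and verification methods aim to guarantee that no perturbation within a given radius `eps` can flip the prediction in this way.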

Contact: Aleksei Kuvshinov

References (Background):

  • Intriguing properties of neural networks
  • Explaining and harnessing adversarial examples
  • SoK: Certified Robustness for Deep Neural Networks
  • Certified Adversarial Robustness via Randomized Smoothing
  • Formal guarantees on the robustness of a classifier against adversarial manipulation
  • Towards deep learning models resistant to adversarial attacks
  • Provable defenses against adversarial examples via the convex outer adversarial polytope
  • Certified defenses against adversarial examples
  • Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks

Uncertainty Estimation in Deep Learning

Type: Master's Thesis / Guided Research

  • Strong knowledge in probability theory

Safe prediction is a key feature in many intelligent systems. Classically, machine learning models compute output predictions regardless of the underlying uncertainty of the encountered situations. In contrast, aleatoric uncertainty (inherent noise in the data) and epistemic uncertainty (the model's lack of knowledge) bring knowledge about undecidable and uncommon situations. The uncertainty view can be a substantial help in detecting and explaining unsafe predictions, and therefore in making ML systems more robust. The goal of this project is to improve uncertainty estimation in ML models for various types of tasks.
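One common way to separate the two kinds of uncertainty, shown here only as a sketch, uses the class probabilities predicted by several ensemble members: the entropy of the averaged prediction is the total uncertainty, the average entropy of the members approximates the aleatoric part, and their difference (the mutual information) approximates the epistemic part. The function `decompose_uncertainty` is hypothetical, and this ensemble-based decomposition is an assumption, not necessarily the estimator studied in the project.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def decompose_uncertainty(member_probs):
    """Split predictive uncertainty into (total, aleatoric, epistemic).

    member_probs: list of class-probability lists, one per ensemble member.
    """
    n = len(member_probs)
    k = len(member_probs[0])
    mean = [sum(m[c] for m in member_probs) / n for c in range(k)]
    total = entropy(mean)                                     # entropy of the mean
    aleatoric = sum(entropy(m) for m in member_probs) / n     # mean of the entropies
    epistemic = total - aleatoric                             # mutual information
    return total, aleatoric, epistemic


# Members agree that the input is ambiguous: uncertainty is aleatoric.
print(decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]]))
# Members are individually confident but disagree: uncertainty is epistemic.
print(decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]]))
```

Both inputs have the same total uncertainty, but only the second indicates an uncommon situation in which collecting more training data could help.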

Contact: Tom Wollschläger, Dominik Fuchsgruber, Bertrand Charpentier

  • Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
  • Predictive Uncertainty Estimation via Prior Networks
  • Posterior Network: Uncertainty Estimation without OOD samples via Density-based Pseudo-Counts
  • Evidential Deep Learning to Quantify Classification Uncertainty
  • Weight Uncertainty in Neural Networks

Hierarchies in Deep Learning

Type:  Master's Thesis / Guided Research

Multi-scale structures are ubiquitous in real-life datasets. As an example, phylogenetic nomenclature naturally reveals a hierarchical classification of species based on their evolutionary history. Learning multi-scale structures can help to reveal natural and meaningful organizations in the data and to obtain compact data representations. The goal of this project is to leverage multi-scale structures to improve the speed, performance and understanding of Deep Learning models.

Contact: Marcel Kollovieh, Bertrand Charpentier

  • Tree Sampling Divergence: An Information-Theoretic Metric for Hierarchical Graph Clustering
  • Hierarchical Graph Representation Learning with Differentiable Pooling
  • Gradient-based Hierarchical Clustering
  • Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space

Eindhoven University of Technology research portal

Data Mining

  • Data Science
  • Data and Artificial Intelligence

Student theses

3d face reconstruction using deep learning.

Supervisor: Medeiros de Carvalho, R. (Supervisor 1), Gallucci, A. (Supervisor 2) & Vanschoren, J. (Supervisor 2)

Student thesis: Master

Achieving Long Term Fairness through Curiosity Driven Reinforcement Learning: How intrinsic motivation influences fairness in algorithmic decision making

Supervisor: Pechenizkiy, M. (Supervisor 1), Gajane, P. (Supervisor 2) & Kapodistria, S. (Supervisor 2)

Activity Recognition Using Deep Learning in Videos under Clinical Setting

Supervisor: Duivesteijn, W. (Supervisor 1), Papapetrou, O. (Supervisor 2), Zhang, L. (External person) (External coach) & Vasu, J. D. (External coach)

A Data Cleaning Assistant

Supervisor: Vanschoren, J. (Supervisor 1)

Student thesis: Bachelor

A Data Cleaning Assistant for Machine Learning

A deep learning approach for clustering a multi-class dataset.

Supervisor: Pei, Y. (Supervisor 1), Marczak, M. (External person) (External coach) & Groen, J. (External person) (External coach)

Aerial Imagery Pixel-level Segmentation

A framework for understanding business process remaining time predictions.

Supervisor: Pechenizkiy, M. (Supervisor 1) & Scheepens, R. J. (Supervisor 2)

A Hybrid Model for Pedestrian Motion Prediction

Supervisor: Pechenizkiy, M. (Supervisor 1), Muñoz Sánchez, M. (Supervisor 2), Silvas, E. (External coach) & Smit, R. M. B. (External coach)

Algorithms for center-based trajectory clustering

Supervisor: Buchin, K. (Supervisor 1) & Driemel, A. (Supervisor 2)

Allocation Decision-Making in Service Supply Chain with Deep Reinforcement Learning

Supervisor: Zhang, Y. (Supervisor 1), van Jaarsveld, W. L. (Supervisor 2), Menkovski, V. (Supervisor 2) & Lamghari-Idrissi, D. (Supervisor 2)

Analyzing Policy Gradient approaches towards Rapid Policy Transfer

An empirical study on dynamic curriculum learning in information retrieval.

Supervisor: Fang, M. (Supervisor 1)

An Explainable Approach to Multi-contextual Fake News Detection

Supervisor: Pechenizkiy, M. (Supervisor 1), Pei, Y. (Supervisor 2) & Das, B. (External person) (External coach)

An exploration and evaluation of concept based interpretability methods as a measure of representation quality in neural networks

Supervisor: Menkovski, V. (Supervisor 1) & Stolikj, M. (External coach)

Anomaly detection in image data sets using disentangled representations

Supervisor: Menkovski, V. (Supervisor 1) & Tonnaer, L. M. A. (Supervisor 2)

Anomaly Detection in Polysomnography signals using AI

Supervisor: Pechenizkiy, M. (Supervisor 1), Schwanz Dias, S. (Supervisor 2) & Belur Nagaraj, S. (External person) (External coach)

Anomaly detection in text data using deep generative models

Supervisor: Menkovski, V. (Supervisor 1) & van Ipenburg, W. (External person) (External coach)

Anomaly Detection on Dynamic Graph

Supervisor: Pei, Y. (Supervisor 1), Fang, M. (Supervisor 2) & Monemizadeh, M. (Supervisor 2)

Anomaly Detection on Finite Multivariate Time Series from Semi-Automated Screwing Applications

Supervisor: Pechenizkiy, M. (Supervisor 1) & Schwanz Dias, S. (Supervisor 2)

Anomaly Detection on Multivariate Time Series Using GANs

Supervisor: Pei, Y. (Supervisor 1) & Kruizinga, P. (External person) (External coach)

Anomaly detection on vibration data

Supervisor: Hess, S. (Supervisor 1), Pechenizkiy, M. (Supervisor 2), Yakovets, N. (Supervisor 2) & Uusitalo, J. (External person) (External coach)

Application of P&ID symbol detection and classification for generation of material take-off documents (MTOs)

Supervisor: Pechenizkiy, M. (Supervisor 1), Banotra, R. (External person) (External coach) & Ya-alimadad, M. (External person) (External coach)

Applications of deep generative models to Tokamak Nuclear Fusion

Supervisor: Koelman, J. M. V. A. (Supervisor 1), Menkovski, V. (Supervisor 2), Citrin, J. (Supervisor 2) & van de Plassche, K. L. (External coach)

A Similarity Based Meta-Learning Approach to Building Pipeline Portfolios for Automated Machine Learning

Aspect-based few-shot learning.

Supervisor: Menkovski, V. (Supervisor 1)

Assessing Bias and Fairness in Machine Learning through a Causal Lens

Supervisor: Pechenizkiy, M. (Supervisor 1)

Assessing fairness in anomaly detection: A framework for developing a context-aware fairness tool to assess rule-based models

Supervisor: Pechenizkiy, M. (Supervisor 1), Weerts, H. J. P. (Supervisor 2), van Ipenburg, W. (External person) (External coach) & Veldsink, J. W. (External person) (External coach)

A Study of an Open-Ended Strategy for Learning Complex Locomotion Skills

A systematic determination of metrics for classification tasks in openml, a universally applicable emm framework.

Supervisor: Duivesteijn, W. (Supervisor 1), van Dongen, B. F. (Supervisor 2) & Yakovets, N. (Supervisor 2)

Automated machine learning with gradient boosting and meta-learning

Automated object recognition of solar panels in aerial photographs: a case study in the liander service area.

Supervisor: Pechenizkiy, M. (Supervisor 1), Medeiros de Carvalho, R. (Supervisor 2) & Weelinck, T. (External person) (External coach)

Automatic data cleaning

Automatic scoring of short open-ended questions.

Supervisor: Pechenizkiy, M. (Supervisor 1) & van Gils, S. (External coach)

Automatic Synthesis of Machine Learning Pipelines consisting of Pre-Trained Models for Multimodal Data

Automating string encoding in automl, autoregressive neural networks to model electroencephalography signals.

Supervisor: Vanschoren, J. (Supervisor 1), Pfundtner, S. (External person) (External coach) & Radha, M. (External coach)

Balancing Efficiency and Fairness on Ride-Hailing Platforms via Reinforcement Learning

Supervisor: Tavakol, M. (Supervisor 1), Pechenizkiy, M. (Supervisor 2) & Boon, M. A. A. (Supervisor 2)

Benchmarking Audio DeepFake Detection

Better clustering evaluation for the openml evaluation engine.

Supervisor: Vanschoren, J. (Supervisor 1), Gijsbers, P. (Supervisor 2) & Singh, P. (Supervisor 2)

Bi-level pipeline optimization for scalable AutoML

Supervisor: Nobile, M. (Supervisor 1), Vanschoren, J. (Supervisor 1), Medeiros de Carvalho, R. (Supervisor 2) & Bliek, L. (Supervisor 2)

Block-sparse evolutionary training using weight momentum evolution: training methods for hardware efficient sparse neural networks

Supervisor: Mocanu, D. (Supervisor 1), Zhang, Y. (Supervisor 2) & Lowet, D. J. C. (External coach)

Boolean Matrix Factorization and Completion

Supervisor: Peharz, R. (Supervisor 1) & Hess, S. (Supervisor 2)

Bootstrap Hypothesis Tests for Evaluating Subgroup Descriptions in Exceptional Model Mining

Supervisor: Duivesteijn, W. (Supervisor 1) & Schouten, R. M. (Supervisor 2)

Bottom-Up Search: A Distance-Based Search Strategy for Supervised Local Pattern Mining on Multi-Dimensional Target Spaces

Supervisor: Duivesteijn, W. (Supervisor 1), Serebrenik, A. (Supervisor 2) & Kromwijk, T. J. (Supervisor 2)

Bridging the Domain-Gap in Computer Vision Tasks

Supervisor: Mocanu, D. C. (Supervisor 1) & Lowet, D. J. C. (External coach)

CCESO: Auditing AI Fairness By Comparing Counterfactual Explanations of Similar Objects

Supervisor: Pechenizkiy, M. (Supervisor 1) & Hoogland, K. (External person) (External coach)

Clean-Label Poison Attacks on Machine Learning

Supervisor: Michiels, W. P. A. J. (Supervisor 1), Schalij, F. D. (External coach) & Hess, S. (Supervisor 2)

Data Mining research of bachelor's degree dissertation of College of Electronic and electrical engineering in a university in China

Data Mining: Recently Published Documents

Distance Based Pattern Driven Mining for Outlier Detection in High Dimensional Big Dataset

Detecting outliers, or anomalies, is one of the central problems in pattern-driven data mining. Outlier detection identifies inconsistent behavior of individual objects. It is an important area of data mining with many applications, such as detecting credit card fraud, hacking, and criminal activity. Tools are therefore needed to uncover the critical information hidden in large datasets. This paper investigates a novel method for detecting cluster outliers in a multidimensional dataset, capable of identifying both clusters and outliers in datasets containing noise. The proposed method can detect the groups and outliers left by the clustering process, like instant irregular sets of clusters (C) and outliers (O), to boost the results. Applying the algorithm to the dataset improved the results on several parameters. For the comparative analysis, the accurate average value and the recall value are computed. The accurate average value is 74.05% for the existing COID algorithm and 77.21% for the proposed algorithm. The average recall value is 81.19% for the existing algorithm and 89.51% for the proposed one, which indicates that the proposed method performs better than the existing COID algorithm.
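For readers unfamiliar with distance-based outlier detection, the general idea can be sketched with a k-nearest-neighbor score: a point whose k-th nearest neighbor is far away is likely an outlier. This is a generic textbook baseline with a hypothetical function name, not the algorithm proposed in the abstract above and not COID.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor."""
    if not 1 <= k <= len(points) - 1:
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = knn_outlier_scores(points, k=2)
most_anomalous = scores.index(max(scores))
print(points[most_anomalous])  # the isolated point far from the unit-square cluster
```

The brute-force loop needs a number of distance computations that grows quadratically with the dataset size, which is one reason why scalable variants are needed for high-dimensional big data.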

Implementation of Data Mining Technology in Bonded Warehouse Inbound and Outbound Goods Trade

For taxed goods, the actual freight is generally determined by multiplying the allocated freight per kg by the actual outgoing weight, based on the outgoing order number on the outgoing bill. Considering that conventional logistics cannot cope with the rapid response that e-commerce orders demand, this work discusses the implementation of data mining technology in bonded warehouse inbound and outbound goods trade. Specifically, a bonded warehouse decision-making system with a data warehouse, conceptual model, online analytical processing system, human-computer interaction module and web data sharing platform was developed. The statistical query module can be used to perform statistics and queries on warehousing operations. After optimization of the whole warehousing business process, it takes only 19.1 hours to obtain the actual freight, nearly one third less than before optimization. This study could create a better environment for the development of China's processing trade.

Multi-objective economic load dispatch method based on data mining technology for large coal-fired power plants

User activity classification and domain-wise ranking through social interactions.

Twitter has become prevalent among users across numerous domains, in most countries, and among different age groups. It serves as a real-time micro-blogging service for communication and opinion sharing. Twitter shares its data for research and study purposes through open APIs, which makes it a highly suitable source of data for social media analytics. Applying data mining and machine learning techniques to tweets is attracting growing interest. The most prominent problem in social media analytics is to automatically identify and rank influencers. This research aims to detect users' topics of interest in social media and to rank users based on specific topics, domains, etc. A few hybrid parameters, based on the post's content, the post's metadata, the user's profile, and the user's network features, are also identified to capture different aspects of being influential and are used in the ranking algorithm. The results indicate that the proposed approach is effective in both the classification and the ranking of individuals in a cluster.

A data mining analysis of COVID-19 cases in states of United States of America

Epidemic diseases can be extremely dangerous. They may have negative effects on economies, businesses, the environment, humans, and the workforce. In this paper, some of the factors that are interrelated with the COVID-19 pandemic are examined using data mining methodologies and approaches. The analysis uncovered a number of rules and insights, and the performance of the data mining algorithms was evaluated. According to the results, the JRip algorithm had the highest correct classification rate and the lowest root mean squared error (RMSE). Considering classification rate and RMSE, JRip can be regarded as an effective method for understanding the factors related to deaths caused by the coronavirus.

Exploring distributed energy generation for sustainable development: A data mining approach

A comprehensive guideline for bengali sentiment annotation.

Sentiment Analysis (SA) is a Natural Language Processing (NLP) and Information Extraction (IE) task that primarily aims to obtain the writer's feelings, expressed as positive or negative, by analyzing a large number of documents. SA is also widely studied in the fields of data mining, web mining, text mining, and information retrieval. The fundamental task in sentiment analysis is to classify the polarity of a given content as Positive, Negative, or Neutral. Although extensive research has been conducted in this area of computational linguistics, most of it has been carried out in the context of the English language. However, Bengali sentiment expression has varying degrees of sentiment labels, which can be plausibly distinct from the English language. Therefore, it is important that sentiment assessment for the Bengali language is developed and executed properly. In sentiment analysis, the predictive power of an automatic model depends heavily on the quality of dataset annotation. Bengali sentiment annotation is a challenging task due to the diverse structures (syntax) of the language and its different degrees of innate sentiment (i.e., weakly and strongly positive/negative sentiments). Thus, in this article, we propose a novel and precise guideline for researchers, linguistic experts, and referees to annotate Bengali sentences accurately, with a view to building effective datasets for automatic sentiment prediction.

Capturing Dynamics of Information Diffusion in SNS: A Survey of Methodology and Techniques

Studying information diffusion in SNS (Social Network Services) has remarkable significance in both academia and industry. Theoretically, it boosts the development of other subjects such as statistics, sociology, and data mining. Practically, diffusion modeling provides fundamental support for many downstream applications (e.g., public opinion monitoring, rumor source identification, and viral marketing). Tremendous efforts have been devoted to this area to understand and quantify information diffusion dynamics. This survey investigates and summarizes the emerging distinguished works in diffusion modeling. We first put forward a unified information diffusion concept in terms of three components: information, user decision, and social vectors, followed by a detailed introduction of the methodologies for diffusion modeling. We then propose a new taxonomy adopting a hybrid philosophy (i.e., granularity and techniques) and conduct a series of comparative studies on elementary diffusion models under this taxonomy, covering assumptions, methods, and pros and cons. We further summarize representative diffusion modeling in special scenarios and significant downstream tasks based on these elementary models. Finally, open issues in this field following the methodology of diffusion modeling are discussed.

The Influence of E-book Teaching on the Motivation and Effectiveness of Learning Law by Using Data Mining Analysis

This paper studies the motivation for learning law, compares the teaching effectiveness of two teaching methods, e-book teaching and traditional teaching, and analyses the influence of e-book teaching on the effectiveness of learning law by using big data analysis. From the perspective of law student psychology, e-book teaching can attract students' attention, stimulate their interest in learning, deepen the impression that knowledge leaves while learning, expand knowledge, and ultimately improve performance in practical assessment. Because the sample size is small, the representativeness of the research results may be limited. The findings have referential significance for stimulating learning motivation in law and other theoretical disciplines at colleges and universities, and they provide ideas for reforming teaching modes there. This paper uses a decision tree algorithm from data mining for the analysis and identifies, from the students' perspective, the factors that influence law students' learning motivation and effectiveness in the learning process.
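
As a rough illustration of how a decision tree surfaces influencing factors, the sketch below ranks candidate features by information gain, the split criterion used by ID3-style trees. It is not the paper's implementation: the abstract does not say which decision tree variant or features were used, and the records, feature names (`method`, `interest`), and target (`passed`) are hypothetical toy values invented for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, target):
    """Reduction in target entropy obtained by splitting rows on one feature."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def rank_factors(rows, features, target):
    """Sort features from most to least informative about the target."""
    gains = {f: information_gain(rows, f, target) for f in features}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy records, NOT data from the paper.
STUDENTS = [
    {"method": "e-book", "interest": "high", "passed": "yes"},
    {"method": "e-book", "interest": "high", "passed": "yes"},
    {"method": "e-book", "interest": "low", "passed": "no"},
    {"method": "traditional", "interest": "high", "passed": "yes"},
    {"method": "traditional", "interest": "low", "passed": "no"},
    {"method": "traditional", "interest": "low", "passed": "no"},
]

RANKING = rank_factors(STUDENTS, ["method", "interest"], "passed")

if __name__ == "__main__":
    for feature, gain in RANKING:
        print(f"{feature}: {gain:.3f} bits")
```

A decision tree would split first on the feature at the top of this ranking and then repeat the procedure within each branch, which is why the features near the root are usually read as the strongest influencing factors.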

Intelligent Data Mining based Method for Efficient English Teaching and Cultural Analysis

The emergence of online education has greatly helped improve the quality of traditional English teaching. However, it only moves the teaching process from offline to online and does not change the essence of traditional English teaching. In this work, we study an intelligent English teaching method to further improve the quality of English teaching. Specifically, a random forest is first used to analyze and extract the grammatical and syntactic features of English text. A decision-tree-based method is then proposed to predict grammar or syntax issues in the English text. The evaluation results indicate that the proposed method can effectively improve the accuracy of English grammar and syntax recognition.


Bachelor and Master Theses

BSc (97), MSc (74), MSc SciComp (15), Diploma (5)

  • Alexander Kosnac:  Quantity-centric Summarization Techniques for Documents , Bachelor Thesis, March 2024.
  • Nicolas Hellthaler: Footnote-Augmented Documents for Passage Retrieval , Bachelor Thesis, February 2024.
  • Simon Gimmini: Exploring Temporal Patterns in Art Through Diffusion Models , Master Thesis, February 2024
  • Xingqi Cheng:  A Rule-based Post-processor for Temporal Knowledge Graph Extrapolation , Master Thesis, January 2024.
  • Raphael Ebner: Leveraging Large Language Models for Information Extraction and Knowledge Representation , Bachelor Thesis, January 2024.
  • Angelina Basova: Table Extraction from PDF Documents , Master Thesis, December 2023
  • Milena Bruseva:  Benchmarking Vector Databases: A Framework for Evaluating Embedding Based Retrieval , Master Thesis, December 2023.
  • Luis Wettach:  Medical Electronic Data Capture at Home – A Privacy Compliant Framework , Master Thesis, December 2023.
  • Jayson Pyanowski:  Semantic Search with Contextualized Query Generation , Master Thesis, December 2023.
  • Philipp Göldner: Information Retrieval using Sparse Embeddings , Master Thesis, December 2023.
  • Vivian Kazakova:  A Topic Modeling Framework for Biomedical Text Analysis , Bachelor Thesis, October 2023
  • Dennis Geiselmann: Context-Aware Dense Retrieval , Master Thesis, October 2023.
  • Konrad Goldenbaum: Semantic Search and Topic Exploration of Scientific Paper Corpora , Bachelor Thesis, October 2023
  • Yingying Cao:  Keyword-based Summarization of (Legal) Documents , Master Thesis Scientific Computing, August 2023.
  • Julian Freyberg: Structural and Logical Document Layout Analysis using Graph Neural Networks , Master Thesis, August 2023.
  • Marina Walther:  A Universal Online Social Network Conversation Model , Master Thesis, August 2023.
  • David Pohl:  Zero-Shot Word Sense Disambiguation using Word Embeddings , Bachelor Thesis, August 2023
  • Klemens Gerber:  Automatic Enrichment of Company Information in Knowledge Graphs , Master Thesis, August 2023.
  • Bastian Müller:  An Adaptable Question Answering Framework with Source-Citations , Bachelor Thesis, August 2023
  • Jiahui Li:  Styled Text Summarization via Domain-specific Paraphrasing ,  Master Thesis Scientific Computing, July 2023.
  • Sophia Matthis: Multi-Aspect Exploration of Plenary Protocols , Master Thesis, June 2023.
  • Till Rostalski:  A Generic Patient Similarity Framework for Clinical Data Analysis , Bachelor Thesis, June 2023
  • David Jackson:  Automated Extraction of Drug Analysis and Discovery Networks , Master Thesis Scientific Computing, May 2023.
  • Christopher Brückner:  Multi-Feature Clustering of Search Results , Master Thesis, April 2023.
  • Paul Dietze:  Formula Classification and Mathematical Token Embeddings , Bachelor Thesis, April 2023.
  • Sophia Hammes:  A Neural-Based Approach for Link Discovery in the Process Management Domain , Master Thesis, March 2023.
  • Fabian Kneissl:  Time-Dependent Graph Modeling of Twitter Conversations , Master Thesis, March 2023.
  • Lucienne-Sophie Marmé:   A Bootstrap Approach for Classifying Political Tweets into Policy Fields , Bachelor Thesis, March 2023.
  • Jing Fan: Assessing Factual Accuracy of Generated Text Using Semantic Role Labeling , Bachelor Thesis, March 2023.
  • Fabio Gebhard:  A Rule-based Approach for Numerical Question Answering , Master Thesis, December 2022.
  • Severin Laicher:  Learning and Exploring Similarity of Sales Items and its Dependency on Sales Data , Master Thesis, September 2022.
  • Raeesa Yousaf: Explainability of Graph Roles Extracted from Networks , Bachelor Thesis, September 2022.
  • Julian Seibel: Towards GAN-based Open-World Knowledge Graph Completion , Master Thesis, June 2022.
  • Claire Zhao Sun: Extracting and Exploring Causal Factors from Financial Documents , Master Thesis Scientific Computing, May 2022.
  • Ziqiu Zhou:  Semantic Extensions of OSM Data Through Mining Tweets in the Domain of Disaster Management , Master Thesis, May 2022.
  • Lukas Ballweg:  Analysis of Lobby Networks and their Extraction from Semi-Structured Data ,  Bachelor Thesis, April 2022.
  • Benjamin Wagner:  Benchmarking Graph Databases for Knowledge Graph Handling , Bachelor Thesis, March 2022.
  • Cedric Bender:  Exploration and Analysis of Methods for German Tweet Stream Summarization , Bachelor Thesis, March 2022. 
  • Johannes Klüh:  Polyphonic Music Generation for Multiple Instruments using Music Transformer , Bachelor Thesis, March 2022.
  • Nicolas Reuter: Automatic Annotation of Song Lyrics Using Wikipedia Resources , Master Thesis, December 2021.
  • Mateusz Chrzastek: Extractive Keyphrases from Noun Chunk Similarity , Bachelor Thesis, October 2021.
  • Fabrizio Primerano: Document Information Extraction from Visually-rich Documents with Unbalanced Class Structure , Master Thesis, October 2021.
  • Sarah Marie Bopp: Gender-centric Analysis of Tweets from German Politicians , Bachelor Thesis, September 2021.
  • Philipp Göldner: A Framework for Numerical Information Extraction , Bachelor Thesis, July 2021.
  • Robin Khanna: Adaptive Topic Modelling for Twitter Data , Bachelor Thesis, July 2021.
  • Thomas Rekers: Correlating Postings from Different Social Media Platforms , Master Thesis, July 2021.
  • Duc Anh Phi: Background Linking of News Articles , Master Thesis, May 2021.
  • Eike Harms: Linking Table and Text Quantities in Documents , Master Thesis, April 2021.
  • Raphael Arndt: Regelbasierte Binärklassifizierung von Webseiten , Bachelor Thesis, April 2021.
  • Jonas Gann: Integrating Identity Management Providers based on Online Access Law , Bachelor Thesis, March 2021.
  • Björn Ternes: Kontextbasierte Informationsextraktion aus Datenschutzerklärungen , Bachelor Thesis, March 2021.
  • Fabio Becker: A Generative Model for Dynamic Networks with Community Structures , Master Thesis, December 2020.
  • Jan-Gabriel Mylius: Visual Analysis of Paragraph Similarity , Bachelor Thesis, December 2020
  • Alexander Hebel: Information Retrieval mit PostgreSQL , Master Thesis, November 2020.
  • Jonas Albrecht: Lexikon-basierte Sentimentanalyse von Tweets , Bachelor Thesis, November 2020.
  • Marina Walther: A Network-based Approach to Investigate Medical Time Series Data , Bachelor Thesis, September 2020.
  • Stefan Hickl: Automatisierte Generierung von Inhaltsverzeichnissen aus PDF-Dokumenten , Bachelor Thesis, September 2020.
  • Christopher Brückner: Structure-centric Near-Duplicate Detection , Bachelor Thesis, August 2020.
  • David Jackson: Extracting Knowledge Graphs from Biomedical Literature , Bachelor Thesis, August 2020.
  • David Richter: Single-Pass Training von Klassifikatoren basierend auf einem großem Web-Korpus , Master Thesis, August 2020.
  • Julian Freyberg: Time-sensitive Multi-label Classification of News Articles , Bachelor Thesis, July 2020.
  • John Ziegler: Modelling and Exploration of Property Graphs for Open Source Intelligence , Master Thesis, August 2020.
  • Johannes Keller: A Network-based Approach for Modeling Twitter Topics , Master Thesis, June 2020.
  • Erik Koynov: Three Stage Statute Retrieval Algorithm with BERT and Hierarchical Pretraining , Bachelor Thesis, May 2020.
  • Fabian Kaiser: Cross-Reference Resolution in German and European Law , Master Thesis, April 2020.
  • Hasan Malik: Open Numerical Information Extraction , Master Thesis, Scientific Computing, March 2020.
  • Matthias Rein: Exploration of User Networks and Content Analysis of the German Political Twittersphere , Master Thesis, March 2020.
  • Philip Hausner: Time-centric Content Exploration in Large Document Collections , Master Thesis, March 2020.
  • Mohammad Dawas: On the Analysis of Networks Extracted from Relational Databases , Master Thesis, Scientific Computing, February 2020.
  • Lea Zimmermann: Mapping Machine Learning Frameworks to End2End Infrastructures , Bachelor Thesis, February 2020
  • Bente Nittka: Modelling Verdict Documents for Automated Judgment Grounds Prediction , Bachelor Thesis, November 2019
  • Michael Pronkin: A Framework for a Person-Centric Gazetteer Service , Bachelor Thesis, November 2019
  • Jessica Löhr: Analysis and Exploration of Register Data of Companies , Bachelor Thesis, October 2019
  • Seida Basha: Extraction of Comment Threads of Political News Articles , Bachelor Thesis, September 2019
  • Lukas Rüttgers: Analyse von YouTube-Kommentaren zur Förderung von Diskussionen , Master Thesis, Scientific Computing, July 2019
  • Gloria Feher: Concepts in Context: A Network-based Approach , Master Thesis, July 2019
  • Dennis Aumiller: Implementation of a Relational Document Hypergraph for Information Retrieval , Master Thesis, April 2019
  • Raheel Ahsan: Efficient Entity Matching , Master Thesis, Scientific Computing, March 2019
  • Christian Straßberger: Time-Varying Graphs to Explore Medical Time Series , Master Thesis, Scientific Computing, March 2019
  • Frederik Schwabe: Zitationsnetzwerke in Gesetzestexten und juristischen Entscheidungen , Bachelor Thesis, February 2019
  • Kilian Claudius Valenti: Extraktion und Exploration von Kookkurenznetzwerken aus Arztbriefen , Bachelor Thesis, February 2019
  • Satya Almasian: Learning Joint Vector Representation of Words and Named Entities , Master Thesis, Scientific Computing, October 2018
  • Naghmeh Fazeli: Evolutionary Analysis of News Article Networks , Master Thesis, Scientific Computing, October 2018
  • Lukas Kades: Development and Evaluation of an Indoor Simulation Model for Visitor Behaviour on a Trade Fair , Master Thesis, October 2018
  • David Stronczek: Named Entity Disambiguation using Implicit Networks , Master Thesis, August 2018
  • Julius Franz Foitzik: A Social Network Approach towards Location-based Recommendation , Master Thesis, April 2018
  • Carine Dengler: Network-based Modeling and Analysis of Political Debates , Master Thesis, May 2018
  • Maximilian Langknecht: Exploration-Based Feature Analysis of Time Series Using Minimum Spanning Trees ,  Bachelor Thesis, May 2018
  • Jayson Salazar: Extraction and Analysis of Dynamic Co-occurrence Networks from Medical Text , Master Thesis, Scientific Computing, April 2018
  • Fabio Becker: Toponym Resolution in HeidelPlace , Bachelor Thesis, April 2018
  • Felix Stern: Correlating Finance News Articles and Stock Indexes , Master Thesis, March 2018
  • Oliver Hommel: Symbolical Inversion of Formulas in an OLAP Context , Master Thesis, Scientific Computing,  March 2018
  • Jan Greulich: Reasoning with Imprecise Temporal and Geographical Data , Master Thesis, February 2018
  • Johannes Visintini: Modelling and Analyzing Political Activity Networks , Bachelor Thesis, February 2018
  • Sebastian Lackner:  Efficient Algorithms for Anti-community Detection , Master Thesis, February 2018
  • Leonard Henger: Erstellung eines konzeptionellen Datenmodells für Zeitreihen und Erkennung von Zeitreihenausreißern , Bachelor Thesis, December 2017
  • Christian Kromm: Short-term travel time prediction in complex contents , Master Thesis, December 2017
  • Christian Schütz: A Generative Model for Correlated Geospatial Property Graphs with Social Network Characteristics , Bachelor Thesis, December 2017
  • Sophia Stahl: Association Rule Based Pattern Mining of Cancer Genome Variants , Master Thesis, December 2017
  • Patrick Breithaupt: Evolving Topic-centric Information Networks , Master Thesis, October 2017
  • Michael Müller: Graph Based Event Summarization , Master Thesis, September 2017
  • Slavin Donchev: Statement Extraction from German Newspaper Articles , Bachelor Thesis, August 2017
  • Dennis Aumiller: Mining Relationship Networks from University Websites , Bachelor Thesis, August 2017
  • Katja Hauser: Latent Information Networks from German Newspaper Articles , Bachelor Thesis, April 2017
  • Xiaoyu Ye: Extraction and Analysis of Organization and Person Networks , Master Thesis, April 2017
  • Martin Enderlein: Modeling and Exploring Company Networks , Bachelor Thesis, January 2017
  • Ludwig Richter: A Generic Gazetteer Data Model and an Extensible Framework for Geoparsing , Master Thesis, October 2016
  • Benjamin Keller: Matching Unlabeled Instances against a Known Data Schema Using Active Learning , Bachelor Thesis, August 2016
  • Julien Stern: Generation and Analysis of Event Networks from GDELT Data , Bachelor Thesis, July 2016
  • Hüseyin Dagaydin: Personalized Filtering of SAP Internal Search Results based on Search Behavior , Master Thesis, March 2016
  • Zaher Aldefai: Improvement of SAP Search HANA results through Text Analysis , Master Thesis, April 2016
  • Jens Cram: Adapting In-Memory Representations of Property Graphs to Mixed Workloads , Bachelor Thesis, April 2016
  • Antonio Jiménez Fernández: Collection and Analysis of User Generated Comments on News Articles , Bachelor Thesis, April 2016
  • Nils Weiher: Temporal Affiliation Network Extraction from Wikidata , Bachelor Thesis, March 2016
  • Claudia Dünkel: Erweiterung des Wu-Holme Modells für Zitationsnetzwerke , Bachelor Thesis, January 2016
  • Muhammad El-Hindi: VisIndex: A Multi-dimensional Tree Index for Histogram Queries , Master Thesis, December 2015
  • Annika Boldt: Rahmenwerk für kontextsensitive Hilfe von webbasierten Anwendungen , Master Thesis, December 2015
  • Carine Dengler: Das INDY-Bildanalyseframework für die Geschichtswissenschaften , Bachelor Thesis, October 2015
  • Leif-Nissen Lundbaek: Conceptional analysis of cryptocurrencies towards smart financial networks , Master Thesis, Scientific Computing, October 2015
  • Viktor Bersch: Effiziente Identifikation von Ereignissen zur Auswertung komplexer Angriffsmuster auf IT Infrastrukturen , Master Thesis, September 2015
  • Ranjani Dilip Banhatti: Graph Regularization Parameter for Non-Negative Matrix Factorization , Master Thesis, Scientific Computing, September 2015
  • Konrad Kühne: Temporal-Topological Analysis of Evolutionary Message Networks , Bachelor Thesis, July 2015
  • Stefanie Bachmann: The K-Function and its use for Bandwidth Parameter Estimation , Bachelor Thesis, July 2015
  • Philipp Daniel Freiberger: Temporal Evolution of Communities in Affiliation Networks , Bachelor Thesis, June 2015
  • Johannes Auer: Bewertung von GitHub Projekten anhand von Eventdaten , Bachelor Thesis, March 2015
  • Christian Kromm: Erkennung und Analyse von Regionalen Hashtag Communities in Twitter , Bachelor Thesis, March 2015
  • Matthias Brandt: Evolution of Correlation of Hashtags in Twitter, Master Thesis, February 2015
  • Jonas Scholten: Effizientes Indexing von Twitter-Daten für temporale und räumliche TopK-Suche unter Verwendung von Mongo DB , Bachelor Thesis, February 2015
  • Patrick Breithaupt: Experimentelle Analyse des Exponential Random Graph Modells , Bachelor Thesis, February 2015
  • Timm Schäuble: Classification of Temporal Relations between Events , Bachelor Thesis, January 2015
  • Andreas Spitz: Analysis and Exploration of Centrality and Referencing Patterns in Networks of News Articles, Master Thesis , November 2014
  • Tobias Zatti: Simulation und Erweiterung von sozialen Netzwerken durch Random Graphs am Beispiel von Twitter , Bachelor Thesis, November 2014
  • Ludwig Richter: Automated Field-Boundary Detection by Trajectory Analysis of Agricultural Machinery , Bachelor Thesis, August 2014
  • Thomas Metzger: Mining Sequential Patterns from Song Lists , Bachelor Thesis, July 2014
  • Arthur Arlt: Determining Rates of False Positives and Negatives in Fast Flux Botnet Detection , Master Thesis, July 2014.
  • Hanna Lange: Stream-based Event and Place Detection from Social Media , June 2014
  • Christian Karr: Effektive Indexierung von räumlichen und zeitlichen Daten , Bachelor Thesis, May 2014
  • Haikuhi Jaghinyan: Evaluation of the HANA Graph Engine, Bachelor Thesis, March 2014
  • Sebastian Rode: Speeding Up Graph Traversals in the SAP HANA Database , Diploma Thesis, Mathematics/Computer Science, March 2014
  • Isil Özge Pekel: Performing Cluster Analysis on Column Store Databases , Master Thesis, March 2014
  • Andreas Runk: Integrating Information about Persons from Linked Open Data , Master Thesis, February 2014
  • Tobias Limpert: Verbesserung der spatio-temporal Event Extraktion und ihrer Kontextinformation durch Relationsextraktionsmethoden , Bachelor Thesis, December 2013
  • Christian Seyda: Comparison of graph-based and vector-space geographical topic detection , Master Thesis, December 2013
  • Bartosz Bogasz: Generation of Place Summaries from Wikipedia , Master Thesis, December 2013
  • David Richter: Segmentierung geographischer Regionen aus Social Media mittels Superpixelverfahren , Bachelor Thesis, October 2013
  • Marek Walkowiak: Gazetteer-gestützte Erkennung und Disambiguierung von Toponymen in Text , Bachelor Thesis, October 2013
  • Mirko Kiefer: Histo: A Protocol for Peer-to-Peer Data Synchronization in Mobile Apps , Bachelor Thesis, September 2013
  • Daniel Egenolf: Extraktion und Normalisierung von Personeninformation für die Kombination mit Spatio-temporal Events , Bachelor Thesis, September 2013
  • Lisa Tuschner: Tag-Recommendation auf Basis von Flickr Daten , Bachelor Thesis, September 2013
  • Edward-Robert Tyercha: An Efficient Access Structure to 3D Mesh Data with Column Store Databases , Master Thesis, September 2013
  • Matthias Iacsa: Study of NetPLSA with respect to regularization in multidimensional spaces , Bachelor Thesis, July 2013
  • Timo Haas: Analyse und Exploration von temporalen Aspekten in OSM-Daten , Bachelor Thesis, June 2013
  • Julian Wintermayr: Evaluation of Semantic Web storage solutions focusing on Spatial and Temporal Queries , Bachelor Thesis, June 2013
  • Bertil Nestorius Baron: Aggregate Center Queries in Dynamic Road Networks , Diploma Thesis, Mathematics/Computer Science, May 2013
  • Viktor Bersch: Methoden zur temporalen Analyse und Exploration von Reviews , Bachelor Thesis, May 2013
  • Cornelius Ratsch: Adaptive String Dictionary Compression in In-Memory Column-Store Database Systems , Master Thesis, April 2013
  • Andreas Zerkowitz: Aufbau und Analyse eines Event-Repository aus Wikipedia , Bachelor Thesis, April 2013
  • Erik von der Osten: Influential Graph Properties of Collaborative-Filtering based Recommender Systems , Diploma Thesis, Mathematics/Computer Science, March 2013
  • Philipp Harth: Local Similarity in Geometric Graphs via Spectral Correspondence , Master Thesis, February 2013
  • Benjamin Kirchholtes: A General Solution for the Point Cloud Docking Problem , Master Thesis, February 2013
  • Manuel Kaufmann: Modellierung und Analyse heuristischer und linguistischer Methoden zur Eventextraktion , Bachelor Thesis, November 2012
  • Dennis Runz: Socio-Spatial Event Detection in Dynamic Interaction Graphs , Master Thesis, November 2012
  • Andreas Schuster: Compressed Data Structures for Tuple Identifiers in Column-Oriented Databases , Master Thesis, October 2012
  • Christian Kapp: Person Comparison based on Name Normalization and Spatio-temporal Events , Master Thesis, September 2012
  • Jörg Hauser: Algorithms for Model Assignment in Multi-Gene Phylogenetics , Master Thesis, August 2012
  • Andreas Klein: The CSGridFile for Managing and Querying Point Data in Column Stores , Master Thesis, August 2012
  • Andreas Runk: Dynamisches Rerouting in Strassennetzwerken , Bachelor Thesis, August 2012
  • Markus Neusinger: Erkennung von Sternströmen mit Hilfe moderner Clusteringverfahren , Diploma Thesis Physics/Computer Science, August 2012
  • Clemens Maier: Visualisierung und Modellierung des auf BRF+ aufgebauten Workflows , Bachelor Thesis, August 2012
  • Daniel Kruck: Investigation of Exact Graph and Tree Isomorphism Problems , Bachelor Thesis, July 2012
  • Andreas Fay: Correlation and Exploration of Events , Master Thesis, February 2012
  • Cornelius Ratsch: Extending Context-Aware Query Autocompletion , Bachelor Thesis, February 2012
  • Alexander Wilhelm: Spezifikation und Suche komplexer Routen in Strassennetzwerken , Diploma Thesis, Mathematics/Computer Science, February 2012
  • Britta Keller: Ein Event-basiertes Ähnlichkeitsmodell für biomedizinische Dokumente , Bachelor Thesis, February 2012
  • Simon Jarke: Effiziente Suche von Substrukturen in grossen geometrischen Graphen , Master Thesis, November 2011
  • Markus Kurz: Visualizing and Exploring Nonparametric Density Estimations of Context-aware Itemsets , Bachelor Thesis, October 2011
  • Frank Tobian: Modelle und Rankingverfahren zur Kombination von textueller und geographischer Suche , Bachelor Thesis, September 2011
  • Alexander Hochmuth: Efficient Computation of Hot Spots in Road Networks , Bachelor Thesis, June 2011
  • Selina Raschack: Spezifikation von Mustern auf räumlichen Daten und Suche von zugehörigen Musterinstanzen , Bachelor Thesis, May 2011
  • Bechir Ben Slama: Dynamische Erkennung von Ausreißern in Straßennetzwerken , Master Thesis, March 2011
  • Marcus Schaber: Scalable Routing using Spatial Database Systems , Bachelor Thesis, March 2011
  • Edward-Robert Tyercha: Co-Location Pattern Mining mit MapReduce , Bachelor Thesis, March 2011
  • Benjamin Hiller: Analyse und Verarbeitung von OpenStreetMap-Daten mit MapReduce , Bachelor Thesis, March 2011
  • Serge Thiery Akoa Owona: Apache Cassandra as Database System for the Activiti BPM Engine , Bachelor Thesis, February 2011
  • Maik Häsner: Bestimmung und Überwachung von Hot Spots in Strassennetzwerken , Master Thesis, October 2010.
  • Philipp Harth: Scale-Dependent Pattern Mining on Volunteered Geographic Information , Bachelor Thesis, August 2010.
  • Peter Artmann: Design and Implementation of a Rule-based Warning and Messaging System , Bachelor Thesis, June 2010.
  • Christopher Röcker: Analyse und Rekonstruktion unvollständiger Sensordaten , Bachelor Thesis, March 2010.
  • Andreas Klein: Eine Indexstruktur zur Verwaltung und Anfrage an Moving Regions auf Grundlage des TPR∗-Baumes , Bachelor Thesis, February 2010.
  • Benjamin Kirchholtes: Object Recognition and Extraction in Satellites Images using the Insight Segmentation and Registration Toolkit (ITK) , Bachelor Thesis, February 2010.
  • Fabian Rühle: Performance Analysis of Column-based Main Memory Databases , Bachelor Thesis, December 2009.
  • Pavel Popov: GeoDok: Extraktion und Visualisierung von Ortsinformationen in Dokumenten , Bachelor Thesis, December 2009.

Bachelor and Master Thesis

We offer a variety of cutting-edge and exciting research topics for Bachelor's and Master's theses. We cover a wide range of topics from Data Science, Natural Language Processing, Argument Mining, the Use of AI in Business, Ethics in AI and Multimodal AI. We are always open to suggestions for your own topics, so please feel free to contact us. We supervise students from all disciplines of business administration, business informatics, computer science and industrial engineering.

Thesis Topics

Example topics could be:

  • Conversational Artificial Intelligence in Insurance and Finance
  • Natural Language Processing for Understanding Financial Narratives: An Overview
  • Ethics at the Intersection of Finance and AI: A Comprehensive Literature Review
  • Explainable Natural Language Processing for Credit Risk Assessment Models: A Literature Review

Thesis Template

  • LaTeX template for bachelor and master theses
  • How to use the LaTeX template

Q1: How many pages do I need to write?

A: In general, the number of pages is only a poor indicator of the quality of a thesis. However, as a rule of thumb, bachelor theses should have around 30 pages, while master theses should be around 60 pages of main content (that is, without the appendix and lists of tables, symbols, figures, references etc.).

Q2: How often should I meet with my supervisor?

A: Your supervisors are typically very busy people. However, don't hesitate to ask in case you have questions. For instance, if you are unsure of some requirements, or in case you have methodological problems, it is absolutely necessary to talk to your supervisor. As a rule of thumb, you should meet at least three times (once in the beginning, once in the middle, and once before the submission).

Q3: Am I allowed to use any AI models in the process of writing my thesis?

A: In general, we neither forbid nor recommend the use of AI for writing support. However, if you use AI, please inform your supervisor. Also, you need to adhere to the recommendations on the use of AI writing assistants given by the faculty.

Q4: How much time do I have?

A: The exact timing is dependent on your study program! Thus, please check the examination requirements before the official start of your thesis -- you are responsible for sticking to the rules.

Humboldt-Universität zu Berlin - Faculty of Mathematics and Natural Sciences - Process Management and Information Systems

Bachelor and master thesis.

General Information

Our team offers bachelor and master thesis topics as well as student projects to be written in English.

Concerning the theses, there are two application windows per year in which new topics are available. The first window is open from February 1st until April 1st. The second window is open from July 1st until October 1st. During these windows, students can express their interest in a topic by sending an email to Dr. Saimir Bala (firstname[dot]lastname[at]hu-berlin.de).

Info sessions are held twice a year, in which we explain the process of writing a thesis with our team. The next info session is scheduled for February 21st, 2024 [Zoom link will be available soon].

The last info session took place on September 28th, 2023. Here you can find the slides (part I, part II) and recordings (part I, part II) of previous info sessions.

Furthermore, find below a summary of guidelines for working on your thesis with us.

Process Overview

  • There are two main time windows in which the team proposes new topics: Feb 1st – Apr 1st and Jul 1st – Oct 1st.
  • Within these windows, students can apply for an open topic (see the list of open topics below).
  • Applications are made by sending an email to Dr. Saimir Bala (firstname[dot]lastname[at]hu-berlin.de).
  • We collect your applications and assign topics to students in two rounds: the first round in March, the second round after the deadline.
  • Once a student has been matched to a supervisor, a meeting is scheduled to scope the topic.
  • Then, students must submit a research proposal to the supervisor within a month.
  • If the proposal is graded as passed, the supervision is officially registered.
  • Once the thesis work is concluded, the thesis defense is scheduled within a dedicated defense slot.

Important Dates

09.02.2024: New topics released. Students can express their interest.

21.02.2024: Info session at 12:00 [Zoom link here]

01.03.2024: Topic assignment (1st round)

01.04.2024: Expression of interest deadline

08.04.2024: Topic assignment (2nd round)

08.05.2024: Research proposal submission

15.05.2024: Official start (if proposal sufficient)

Please consider the following hints and guidelines for working on your thesis:

  • Templates for thesis and proposal: https://www.informatik.hu-berlin.de/de/studium/formulare/vorlagen
  • Page limits are as follows:
  • Bachelor Informatik: 40 pages; Kombibachelor Lehramt Informatik: 30 pages
  • Master Informatik: 80 pages; Master Information Systems: 60 pages
  • The limits do not include the cover, table of contents, references, and appendices.

Prerequisites

The candidate is expected to be familiar with the general rules of writing a scientific paper. Some general references are helpful for framing any thesis, no matter which topic:

  • Wil van der Aalst: How to Write Beautiful Process and Data Science Papers? Archive Report (2022).
  • Jan Recker: Scientific Research in Information Systems: A Beginner's Guide. Springer, Heidelberg, Germany (2021).
  • Jan Mendling, Benoit Depaire, Henrik Leopold: Methodology of Algorithm Engineering. Archive Report (2023).
  • Claes Wohlin, Pär Runeson, Martin Höst, Magnus Ohlsson, Björn Regnell, Anders Wesslén: Experimentation in software engineering. Springer Science & Business Media (2012).
  • Ken Peffers, Tuure Tuunanen, Marcus A. Rothenberger, Samir Chatterjee: A Design Science Research Methodology for Information Systems Research. J of Management Information Systems 24(3): 45-77 (2008).
  • Barbara Kitchenham, Rialette Pretorius, David Budgen, Pearl Brereton, Mark Turner, Mahmood Niazi, Stephen G. Linkman: Systematic literature reviews in software engineering - A tertiary study. Information & Software Technology 52(8): 792-805 (2010).
  • Ad Lagendijk: Survival Guide for Scientists: Writing, Presentation, Email. Amsterdam University Press (2008).
  • Adam LeBrocq: Journal of the Association for Information Systems Style Guide.  https://aisel.aisnet.org/cais/cais_style_guide.pdf

In agreement with the supervisor an individual list of expected readings should be studied by the student in preparation of the actual work on the thesis.

The grading of the thesis takes various criteria into account, relating both to the thesis as a product and the process of establishing its content. These include, but are not limited to:

  • Correctness of spelling and grammar
  • Aesthetic appeal of documents and figures
  • Compliance with formal rules
  • Appropriateness of thesis structure
  • Coverage of relevant literature
  • Appropriateness of research question and method
  • Diligence of own research work
  • Significance of research results
  • Punctuality of work progress
  • Proactiveness of handling research progress

Recent Topics

If you are interested in one of the following topics, please send an email expressing your interest to Dr. Saimir Bala (firstname[dot]lastname[at]hu-berlin.de). Please explain why this topic interests you and how it fits your prior studies. Also state your academic strengths and which semester of your studies you are in.

Topic 1: Actionable Recommendations for Learners in Learning Management Systems Based on Process Mining (Bachelor/Master)

This study explores the advantages of process mining in learning management systems to provide actionable recommendations to learners. By leveraging data-driven insights, it aims to enhance the learning experience by offering personalized guidance and suggestions to learners, ultimately improving their educational outcomes. This research delves into the potential benefits of process mining in the educational context, highlighting its capacity to empower learners on their educational journeys.

Initial References :

  • Wambsganss, Thiemo; Schmitt, Anuschka; Mahnig, Thomas; Ott, Anja; Soellner, Sigitai; Ngo, Ngoc Anh; and Geyer-Klingeberg, Jerome, "The Potential of Technology-Mediated Learning Processes: A Taxonomy and Research Agenda for Educational Process Mining" (2021). ICIS 2021 Proceedings. 1. https://aisel.aisnet.org/icis2021/diglearn_curricula/diglearn_curricula/1
  • AlQaheri, H.; Panda, M. An Education Process Mining Framework: Unveiling Meaningful Information for Understanding Students’ Learning Behavior and Improving Teaching Quality. Information 2022, 13, 29. https://doi.org/10.3390/info13010029
  • Bala, S., Revoredo, K., Mendling, J. (2023). Process Mining for Analyzing Open Questions Computer-Aided Examinations. In: Montali, M., Senderovich, A., Weidlich, M. (eds) Process Mining Workshops. ICPM 2022. Lecture Notes in Business Information Processing, vol 468. Springer, Cham. https://doi.org/10.1007/978-3-031-27815-0_41

Supervisor: Rachmadita Andre Swari

Topic 2: Process discovery based on undesirable traces (Bachelor/Master)

Background : Process discovery techniques have been used to automatically learn a process model using observed traces (i.e., event logs). The traces are assumed correct and the final process model is expected to explain all the traces (i.e., desirable fitness of 1). However, the final model may also explain behaviors that are known by the specialist to be undesirable.

Research problem : The core research problem addressed is: How to learn a process model considering undesirable traces?

The aim is to propose a process discovery method that learns from both desirable and undesirable trace data.
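The problem described in the background can be illustrated with a small sketch. It assumes a directly-follows graph as the process model, which the topic does not prescribe, and the traces and function names are hypothetical. The model is learned from desirable traces only and reaches a fitness of 1 on them, yet it also replays a trace that a specialist would consider undesirable.

```python
# Illustrative assumption: the "process model" is a plain directly-follows graph.
START, END = "<start>", "<end>"


def discover_dfg(traces):
    """Collect every directly-follows pair observed in the traces."""
    edges = set()
    for trace in traces:
        path = [START, *trace, END]
        edges.update(zip(path, path[1:]))
    return edges


def replayable(trace, edges):
    """A trace is explained by the model if each of its steps is a known edge."""
    path = [START, *trace, END]
    return all(pair in edges for pair in zip(path, path[1:]))


def fraction_replayable(traces, edges):
    """Share of traces the model can replay (a crude trace-level fitness)."""
    traces = list(traces)
    if not traces:
        return 0.0
    return sum(replayable(t, edges) for t in traces) / len(traces)


# Hypothetical data: two desirable variants and one undesirable rework loop.
desirable = [["a", "b", "c", "d"], ["a", "c", "b", "d"]]
undesirable = [["a", "b", "c", "b", "d"]]

model = discover_dfg(desirable)
fitness = fraction_replayable(desirable, model)
leak = fraction_replayable(undesirable, model)
print(f"fitness on desirable traces: {fitness}, undesirable traces explained: {leak}")
```

In this toy example the model generalizes over the two desirable variants and thereby admits the loop between b and c. A discovery method that also receives the undesirable trace as input could try to exclude such behavior while keeping fitness high.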

Requirements : The candidate must have previous knowledge of process mining and software development. Further desirable requirements are pro-activity and self-organization.

Initial Reference:

  • Revoredo, K.: On the use of domain knowledge for process model repair. Softw. Syst. Model. (2022)

Supervisor: Kate Cerqueira Revoredo

Topic 3: Context-aware process monitoring  (Bachelor/Master)

Background : Business process monitoring is one of the phases of the BPM cycle concerned with extracting insights from the execution of a process. The digitization of the processes of an organisation has made available a vast amount of trace data about the execution of these processes, which allows for the use of data-driven process monitoring techniques. In many situations, however, the activity information in the trace data alone is not enough to produce accurate output, so recent approaches combine it with other sources of information, such as sensor data or domain knowledge. Still, in most cases this additional data is used in a non-systematic way.

Research problem : The core research problem addressed is: How can contextual data be used for process monitoring? The aim is to propose a method for process monitoring that uses contextual data.

Requirements : The candidate must have previous knowledge of process mining and software development. Further desirable requirements are pro-activity and self-organization.

Initial References:

  • da Cunha Mattos, T., Santoro, F.M., Revoredo, K., Nunes, V.T.: A formal representation for context-aware business processes. Computers in Industry 65(8) (2014) 1193–1214
  • Chamorro, A.E.M., Revoredo, K., Resinas, M., del-Río-Ortega, A., Santoro, F.M., Ruiz-Cortés, A.: Context-aware process performance indicator prediction. IEEE Access 8 (2020) 222050–222063
  • Bayomie, D., Revoredo, K., Mendling, J.: Multi-perspective process analysis: Mining the association between control flow and data objects. In: CAiSE. Volume 13295 of Lecture Notes in Computer Science., Springer (2022) 72–89

Topic 4: Uses of Models in Agile Software Development (Bachelor/Master)

Motivation & problem : Modeling is a key topic in software engineering. In software development projects, among other aspects, modeling supports the developer in understanding the design by providing an overview and a tool for communication with fellow developers and other stakeholders. The benefits of models for supporting system analysis and design activities have been highlighted regarding their cognitive effectiveness, often in the context of traditional methodologies. These benefits have also been discussed in the agile community, but it is still not clear to what extent models are used in agile software development projects.

Objectives : conduct a systematic review of the literature, identify the uses of models in agile software development, categorize and prioritize them, and propose a framework to support agile software development based on these findings. The findings shall be evaluated according to the perspective of practitioners.

Prerequisites : (1) Basic knowledge of agile software development methodologies; (2) Intermediate knowledge of models used in software development; (3) Pro-activity, self-organization, attention to detail (desirable).

Initial References:

  • Ambler, Scott W. The object primer: Agile model-driven development with UML 2.0. Cambridge University Press, 2004.
  • Alfraihi, Hessa Abdulrahman A., and Kevin Charles Lano. "The integration of agile development and model driven development: A systematic literature review." The 5th International Conference on Model-Driven Engineering and Software Development (2017).
  • Wagner, Stefan, Daniel Méndez Fernández, Michael Felderer, Antonio Vetrò, Marcos Kalinowski, Roel Wieringa, Dietmar Pfahl et al. "Status quo in requirements engineering: A theory and a global family of surveys." ACM Transactions on Software Engineering and Methodology (TOSEM) 28, no. 2 (2019): 1-48.
  • Petre, Marian. "UML in practice." In 2013 35th international conference on software engineering (icse), pp. 722-731. IEEE, 2013.

Supervisor: Cielo González Moyano

Topic 5: Artificial Intelligence in Project Management for Software Development Projects (Master)

Motivation & problem : Artificial intelligence is applied in software engineering management for making decisions, estimating, managing technical debt, and planning, to name a few examples. These applications have been widely studied by researchers. However, there is no study that presents an in-depth overview of how artificial intelligence is used for management activities in software development projects. Given the rising interest in artificial intelligence and the need to optimize management in software projects, a holistic overview can potentially be beneficial for practitioners and researchers.

Objectives : conduct a systematic review of the literature to identify the status quo on the topic. The findings shall be evaluated from the perspective of practitioners. The results shall be used to provide a framework that supports project managers of software development projects.

Prerequisites : (1) Basic knowledge of project management for software development projects; (2) Intermediate knowledge of artificial intelligence; (3) Pro-activity, self-organization, attention to detail (desirable).

Initial references:

  • Perkusich, Mirko, et al. "Intelligent software engineering in the context of agile software development: A systematic literature review." Information and Software Technology 119 (2020): 106241.
  • Kotti, Z., Galanopoulou, R., & Spinellis, D. (2023). Machine learning for software engineering: A tertiary study. ACM Computing Surveys, 55(12), 1-39.
  • Fridgeirsson, Thordur Vikingur, et al. "An authoritative study on the near future effect of artificial intelligence on project management knowledge areas." Sustainability 13.4 (2021): 2345.

Topic 6: Literature Review on Business Intelligence and Human Factors (Bachelor or Master in Information Systems)

Motivation & problem : New types of analytical tools fundamentally change the way process analysts do their work, and they are expected to drastically impact various professional services, including auditing and business process management. Recent years have seen an increasing uptake of process mining tools by corporations and by professional services companies, where they are used to support the analysis of business processes. Recent research on the organisational impact of process mining highlights benefits for process awareness and overall value creation, but potential negative effects are hardly understood.

Primary objective : Review the literature on business intelligence and big data analytics and investigate where it addresses negative effects such as those described by Sutton et al. (2023) and Parasuraman et al. (2000).

Prerequisites : (1) Basic knowledge of process mining; (2) Basic knowledge of business process management; (3) Interest in human-computer interaction and engineering psychology.

  • Sutton, S. G., Arnold, V., & Holt, M. (2023). An extension of the theory of technology dominance: Capturing the underlying causal complexity. International Journal of Accounting Information Systems, 50, 100626.
  • Grover, V., Chiang, R. H., Liang, T. P., & Zhang, D. (2018). Creating strategic business value from big data analytics: A research framework. Journal of management information systems, 35(2), 388-423.
  • Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2000). A model for types and levels of human interaction with automation. IEEE Transactions on systems, man, and cybernetics-Part A: Systems and Humans, 30(3), 286-297.
  • Zimmermann, L., Zebra, F., & Weber, B. (2023). What makes life for process mining analysts difficult? A reflection of challenges. Software and Systems Modeling, 1-29.

Supervisor: Jan Mendling

Topic 7: Visualizing Cyclic Time Arrangements in Process Graphs (Bachelor/Master)

Time is essential to understanding processes, yet most process mining approaches are limited to depicting time within a process graph as textual cues or color schemes. Adapting the visual appearance of process graphs to various time arrangements may enhance the accessibility for finding bottlenecks or delays. An example is aligning process graphs along a linear timeline [1]. In cases where processes involve repetitive patterns, such as in chronic health care or crop management, a cyclic arrangement may be useful. However, for the latter, an adequate solution in process mining is needed.

This thesis aims to develop and exemplify a design method for a visual solution in process mining that allows for exploring a cyclic time arrangement in a process graph. We will adapt the research objectives to align with the experience and study goals of the student.

  • H. Kaur, J. Mendling, C. Rubensson, and T. Kampik, “Timeline-based Process Discovery,” CoRR, abs/2401.04114, 2024. Available: https://doi.org/10.48550/arXiv.2401.04114
  • A. Yeshchenko and J. Mendling, “A Survey of Approaches for Event Sequence Analysis and Visualization using the ESeVis Framework.,” CoRR, abs/2202.07941, 2022. Available: https://arxiv.org/abs/2202.07941
  • W. Aigner, S. Miksch, H. Schumann, and C. Tominski, Visualization of Time-Oriented Data. in Human-Computer Interaction Series. London: Springer London, 2011. Available: https://doi.org/10.1007/978-0-85729-079-3.

Supervisor: Christoffer Rubensson

Topic 8: Advanced Resource Analysis in Process Mining (Bachelor/Master)

In the last decade, process mining techniques have been developed to study human behavior in event data, such as the strength of collaboration between co-workers or even stress levels at a workplace. Since measuring human behavior is complex, this is a welcome alternative to more labor-intensive methods like surveys. Still, most techniques are relatively simple and could be improved by applying theoretical frameworks from social science.

This thesis aims to develop a resource analysis approach (e.g., a metric, a concept, or a framework) in process mining grounded in an existing theory from social science. We will adapt the research objectives to align with the experience and study goals of the student.

  • J. Nakatumba and W. M. P. van der Aalst, “Analyzing Resource Behavior Using Process Mining,” in Business Process Management Workshops. BPM 2009. Lecture Notes in Business Information Processing, S. Rinderle-Ma, S. Sadiq, and F. Leymann, Eds., Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. Available: https://doi.org/10.1007/978-3-642-12186-9_8.
  • A. Pika, M. Leyer, M. T. Wynn, C. J. Fidge, A. H. M. Ter Hofstede, and W. M. P. Van der Aalst, “Mining Resource Profiles from Event Logs,” in ACM Transactions on Management Information Systems, vol. 8, no. 1, 1:1-30, 2017. Available: https://doi.org/10.1145/3041218.
  • Z. Huang, X. Lu, and H. Duan, “Resource behavior measure and application in business process management,” in Expert Systems with Applications, vol. 39, no. 7, 6458–6468, 2012. Available: https://doi.org/10.1016/j.eswa.2011.12.061.

Topic 9: Anthropomorphic Perceptions of Large Language Models: what is the gender of ChatGPT and its Counterparts? (Bachelor/Master)

Description : In today's digital era, Large Language Models (LLMs) like ChatGPT are transforming the way we interact with technology, often blurring the boundaries between machine and human cognition. This thesis delves into the intriguing realm of anthropomorphism, the human tendency to attribute human-like qualities to non-human entities. Specifically, this research aims to uncover laypeople's underlying beliefs and implicit conceptions about ChatGPT and similar models concerning an implicit gender attribution. By designing and conducting a survey, the thesis will gain insights into individuals' perception of these cutting-edge technologies. The findings can potentially illuminate not only our relationship with LLMs but also the broader implications of human-machine interactions in an increasingly AI-driven world.

  • Deshpande, A., Rajpurohit, T., Narasimhan, K., & Kalyan, A. (2023). Anthropomorphization of AI: Opportunities and Risks (arXiv:2305.14784). arXiv. https://doi.org/10.48550/arXiv.2305.14784
  • Farina, M., & Lavazza, A. (2023). ChatGPT in society: Emerging issues. Frontiers in Artificial Intelligence, 6. https://www.frontiersin.org/articles/10.3389/frai.2023.1130913
  • Aşkın, G., Saltık, İ., Boz, T. E., & Urgen, B. A. (2023). Gendered Actions with a Genderless Robot: Gender Attribution to Humanoid Robots in Action. International Journal of Social Robotics, 15(11), 1915–1931. https://doi.org/10.1007/s12369-022-00964-0

Supervisor: Jennifer Haase

Topic 10: Process prediction using object-centric event log (Bachelor/Master)

Business process prediction involves forecasting specific details, such as the next activity to be performed, the time remaining for the completion of a process instance, or key process indicators, for an ongoing process instance. Currently, the techniques rely on XES event logs as input data. However, the field of process mining is shifting towards utilizing object-centric event logs, which offer a comprehensive multidimensional view of the data. Despite this advancement, object-centric event logs have been underutilized as input for process prediction.

Research problem:  The core research problem addressed is: How can process prediction benefit from an object-centric event log? The aim is to propose a method for process prediction that uses object-centric event logs.

Requirements:  The candidate must have previous knowledge of process mining and software development. Further desirable requirements are pro-activity and self-organization.
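To make the difference between a case-based log and an object-centric log concrete, the following sketch flattens a tiny object-centric log onto one object type and trains a frequency-based next-activity predictor on the result. The event structure, field names, and activities are hypothetical and do not follow the OCEL 2.0 format. The predictor is a deliberately simple baseline, not the method the thesis is expected to propose.

```python
from collections import Counter, defaultdict

# Hypothetical object-centric events: one event may reference several objects
# of several types, which a single-case XES trace cannot express directly.
events = [
    {"time": 1, "activity": "place order", "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
    {"time": 2, "activity": "pick item", "objects": {"item": ["i1"]}},
    {"time": 3, "activity": "pick item", "objects": {"item": ["i2"]}},
    {"time": 4, "activity": "pay order", "objects": {"order": ["o1"]}},
    {"time": 5, "activity": "place order", "objects": {"order": ["o2"], "item": ["i3"]}},
    {"time": 6, "activity": "pick item", "objects": {"item": ["i3"]}},
    {"time": 7, "activity": "pay order", "objects": {"order": ["o2"]}},
    {"time": 8, "activity": "ship item", "objects": {"item": ["i1"]}},
]


def flatten(events, object_type):
    """Build one activity sequence per object of the chosen type."""
    traces = defaultdict(list)
    for event in sorted(events, key=lambda e: e["time"]):
        for obj in event["objects"].get(object_type, []):
            traces[obj].append(event["activity"])
    return dict(traces)


def train_next_activity(traces):
    """Count, for every activity, which activity directly followed it."""
    successors = defaultdict(Counter)
    for trace in traces.values():
        for current, nxt in zip(trace, trace[1:]):
            successors[current][nxt] += 1
    return successors


def predict_next(successors, activity):
    """Return the most frequent successor, or None if none was observed."""
    counts = successors.get(activity)
    return counts.most_common(1)[0][0] if counts else None


item_traces = flatten(events, "item")
order_traces = flatten(events, "order")
model = train_next_activity(item_traces)
print(predict_next(model, "place order"))
```

Flattening onto "item" and onto "order" yields different traces from the same events, and each view discards the relations to the other object type. A prediction method that works on the object-centric log directly could use those relations as additional input.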

Initial references

  • An Empirical Investigation of Different Classifiers, Encoding, and Ensemble Schemes for Next Event Prediction Using Business Process Event Logs. ACM Trans. Intell. Syst. Technol. 11(6): 68:1-68:34 (2020)
  • Uncovering Object-Centric Data in Classical Event Logs for the Automated Transformation from XES to OCEL. BPM 2022: 379-396
  • Benedikt Knopp, Wil M. P. van der Aalst: Order Management Object-centric Event Log in OCEL 2.0 Standard. Zenodo, 2023

Supervisor: Kate Revoredo

Topic 11: Causation discovery for process prediction (Bachelor/Master)

Business process prediction involves forecasting specific details, such as the next activity to be performed, the time remaining for the completion of a process instance, or key process indicators, for an ongoing process instance. Currently, most techniques rely on the order in which the events happened without considering the cause-effect relation among them.

Research problem : The core research problem addressed is: How can process prediction benefit from the cause-effect relation among the events? The aim is to propose a method to discover the cause relation among events and use this information for process prediction.

Requirements:  The candidate must have previous knowledge of process mining, statistics, and software development. Further desirable requirements are pro-activity and self-organization.

  •  An Empirical Investigation of Different Classifiers, Encoding, and Ensemble Schemes for Next Event Prediction Using Business Process Event Logs. ACM Trans. Intell. Syst. Technol. 11(6): 68:1-68:34 (2020)
  • Jens Brunk, Matthias Stierle, Leon Papke, Kate Revoredo, Martin Matzner, Jörg Becker: Cause vs. effect in context-sensitive prediction of business process instances. Inf. Syst. 95: 101635 (2021)
  • Pearl, J. (2011). Bayesian networks.

Topic 12: Literature review on quality characteristics in dashboards, business intelligence systems, balanced scorecards, and other reporting solutions: a study of visualization methods (Bachelor)

This bachelor thesis aims to conduct a comprehensive literature review on quality characteristics in dashboards, business intelligence systems, balanced scorecards, and other reporting solutions. The focus will be on comparing various visualization methods employed in these systems. The study intends to provide insights into the key features that contribute to the effectiveness and user satisfaction of visual reporting tools, helping to guide the selection and implementation of suitable solutions in diverse organizational contexts.

Initial references :

  • Burstein, F., & Holsapple, C. W. (2008). Handbook on Decision Support Systems 2. https://www.academia.edu/83497312/Handbook_on_Decision_Support_Systems_2
  • Trieu, V.-H. (2023). Towards an understanding of actual business intelligence technology use: An individual user perspective. Information Technology & People, 36(1), 409–432.
  • Webster, J., & Watson, R. T. (2002). Analyzing the past to prepare for the future: Writing a literature review. MIS Quarterly, xiii–xxiii.

Supervisor: Kristina Schneider

Topic 13: Analysis of theoretical explanations and scientific theories on transitioning from dashboards to decision making in organizational contexts (Bachelor)

This bachelor thesis seeks to analyze theoretical explanations and scientific theories concerning the transition from dashboards to decision-making processes. Dashboards are widely used tools in organizational contexts for decision-making. The study aims to examine the levels of management where dashboards are employed and how they contribute to the decision-making process within organizations.

Initial references :

  • Maynard, S., Burstein, F., & Arnott, D. (2001). A multi-faceted decision support system evaluation approach. Journal of Decision Systems, 10(3–4), 395–428.
  • Mintzberg, H., Raisinghani, D., & Theoret, A. (1976). The Structure of “Unstructured” Decision Processes. Administrative Science Quarterly, 21(2), 246. https://doi.org/10.2307/2392045

Topic 14: Enhancing Student Engagement in Online Learning Environments through Process Mining (Bachelor/Master)

This study investigates how process mining techniques can be leveraged to enhance student engagement within online learning environments. It explores the utilization of data-driven insights to optimize learning pathways, identify patterns of student interaction, and design personalized interventions to foster greater engagement and participation in digital educational platforms.

  • Rohani, N., Gal, K., Gallagher, M., Manataki, A. (2023). Discovering Students’ Learning Strategies in a Visual Programming MOOC Through Process Mining Techniques. In: Montali, M., Senderovich, A., Weidlich, M. (eds) Process Mining Workshops. ICPM 2022. Lecture Notes in Business Information Processing, vol 468. Springer, Cham. https://doi.org/10.1007/978-3-031-27815-0_39
  • Umer, R., Susnjak, T., Mathrani, A. and Suriadi, S. (2017), "On predicting academic performance with process mining in learning analytics", Journal of Research in Innovative Teaching & Learning, Vol. 10 No. 2, pp. 160-176. https://doi.org/10.1108/JRIT-09-2017-0022
  • Nkomo, L.M., Nat, M. Student Engagement Patterns in a Blended Learning Environment: an Educational Data Mining Approach. TechTrends 65, 808–817 (2021). https://doi.org/10.1007/s11528-021-00638-0

Topic 15: Runtime Prediction of Alignment Construction Algorithms (Bachelor/Master)

Conformance checking relates a process model to recorded instances of the execution of the process, typically stored in event logs, to determine where expected and actual behaviour deviate from each other. In this context, alignment algorithms are regarded as the de facto standard method, due to their interpretability and accuracy in highlighting precise problem areas in the process. Yet run times for alignment construction are often prohibitively large, typically because of a handful of traces in the log for which constructing an alignment is especially complex. One possible solution to this problem could lie in predicting the expected runtime of aligning single traces in the log, for instance using regression-based methods, and then ignoring traces that are expected to take long.

In this thesis, the student will:

  • derive a methodology for predicting the runtime of alignment construction between event logs and process models
  • evaluate the accuracy of the predictor
  • assess the factors that influence the runtime of alignments

The student is expected to have knowledge of process mining, conformance checking, and basic knowledge of regression analysis, or willingness to dive deep into these topics.
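The idea of predicting alignment runtime and skipping expensive traces can be sketched as follows. This is a minimal illustration under strong assumptions: trace length is used as the only feature, the relation to runtime is taken to be linear, and the training measurements are invented. Identifying which features and which regression model are actually suitable is part of the thesis itself.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical measurements: trace length and observed alignment runtime (seconds).
lengths = [5, 10, 20, 40]
runtimes = [0.5, 1.0, 2.0, 4.0]
slope, intercept = fit_line(lengths, runtimes)


def predict_runtime(trace):
    """Predicted alignment runtime for a trace, based on its length only."""
    return slope * len(trace) + intercept


def select_traces(log, budget_seconds):
    """Keep only traces whose predicted runtime stays within the budget."""
    return [trace for trace in log if predict_runtime(trace) <= budget_seconds]


# Hypothetical log with one short and one very long trace.
log = [["a"] * 8, ["a"] * 100]
kept = select_traces(log, budget_seconds=3.0)
print(len(kept), "of", len(log), "traces selected for alignment")
```

Skipping traces changes the conformance result, so an evaluation would also need to report how many traces were ignored and how that affects the measured deviations.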

  • Carmona, J., van Dongen, B., Solti, A., & Weidlich, M. (2018). Conformance checking. Switzerland: Springer.
  • Backhaus, K., Erichson, B., Weiber, R., Plinke, W. (2016). Regressionsanalyse. In: Multivariate Analysemethoden. Springer Gabler, Berlin, Heidelberg.

Supervisor: Martin Kabierski

Topic 16: Biodiversity-based Saturation for Grounded Theory (Master)

Grounded theory is a research methodology usually applied in qualitative analysis. It involves the collection of data (usually through interviews, surveys, etc.) and the deduction of concepts, categories, and ultimately theories that emerge from the collected data. A central question in this iterative process of data collection and evaluation is when to stop collecting data. Usually one stops when the categories are saturated, i.e. when no new insights are obtained. Determining when exactly this point has been reached is a topic of discussion and research. Species richness estimators, which estimate the completeness of samples, could be utilized to give saturation estimates that are data-driven and grounded in statistics.

  • assess the applicability of species richness estimation for determining saturation in grounded theory
  • implement and apply the estimator to qualitative interview data
  • evaluate the feasibility of the approach and discuss potential limitations

The student is expected to have a solid understanding of statistics and ideally preliminary experience in the analysis of qualitative data. Note that the student is not expected to collect data for the thesis; the data will be provided by us.
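As an illustration of how a species richness estimator could yield a saturation estimate, the sketch below applies the Chao1 estimator to counts of qualitative codes. Choosing Chao1 is an assumption made for this example, since the topic does not name a specific estimator. Treating codes as "species" and code mentions as "individuals" is likewise an assumption, and the counts are invented.

```python
from collections import Counter


def chao1(code_counts):
    """Chao1 estimate of the total number of codes, including unseen ones.

    f1 is the number of codes seen exactly once, f2 the number seen exactly
    twice. The bias-corrected form is used when f2 is zero.
    """
    s_obs = len(code_counts)
    f1 = sum(1 for count in code_counts.values() if count == 1)
    f2 = sum(1 for count in code_counts.values() if count == 2)
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    return s_obs + f1 * (f1 - 1) / 2


def saturation(code_counts):
    """Observed codes as a share of the estimated total (1.0 means saturated)."""
    estimate = chao1(code_counts)
    return len(code_counts) / estimate if estimate else 0.0


# Hypothetical coding result: how often each code was assigned across interviews.
codes = Counter({"trust": 5, "cost": 3, "time": 2, "tools": 2, "training": 1, "culture": 1})

print(f"estimated total codes: {chao1(codes):.1f}")
print(f"estimated saturation: {saturation(codes):.2f}")
```

Recomputing the estimate after each additional interview would give a curve that approaches 1.0 as fewer codes appear only once, which is one possible data-driven stopping signal.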

  • Strauss, A., & Corbin, J. (1994). Grounded theory methodology: An overview. In N. K. Denzin & Y. S. Lincoln (Eds.), Handbook of qualitative research (pp. 273–285). Sage Publications, Inc.
  • Saunders, Benjamin, et al. (2018). Saturation in qualitative research: exploring its conceptualization and operationalization. In: Qual Quant 52 (pp. 1893-1907). Springer
  • Colwell, Robert K., et al. "Models and estimators linking individual-based and sample-based rarefaction, extrapolation and comparison of assemblages." Journal of Plant Ecology 5.1 (2012): 3-21

The Future of AI Research: 20 Thesis Ideas for Undergraduate Students in Machine Learning and Deep Learning for 2023!

A comprehensive guide for crafting an original and innovative thesis in the field of AI.

By Aarafat Islam on 2023-01-11

“The beauty of machine learning is that it can be applied to any problem you want to solve, as long as you can provide the computer with enough examples.” — Andrew Ng

This article provides a list of 20 potential thesis ideas for an undergraduate program in machine learning and deep learning in 2023. Each thesis idea includes an introduction, which presents a brief overview of the topic and the research objectives. The ideas cover different areas of machine learning and deep learning, such as computer vision, natural language processing, robotics, finance, drug discovery, and more. The article also includes explanations, examples, and conclusions for each thesis idea, which can help guide the research and clarify the potential contributions and outcomes of the proposed research. It also emphasizes the importance of originality and the need for proper citation in order to avoid plagiarism.

1. Investigating the use of Generative Adversarial Networks (GANs) in medical imaging:  A deep learning approach to improve the accuracy of medical diagnoses.

Introduction:  Medical imaging is an important tool in the diagnosis and treatment of various medical conditions. However, accurately interpreting medical images can be challenging, especially for less experienced doctors. This thesis aims to explore the use of GANs in medical imaging, in order to improve the accuracy of medical diagnoses.

2. Exploring the use of deep learning in natural language generation (NLG): An analysis of the current state-of-the-art and future potential.

Introduction:  Natural language generation is an important field in natural language processing (NLP) that deals with creating human-like text automatically. Deep learning has shown promising results in NLP tasks such as machine translation, sentiment analysis, and question-answering. This thesis aims to explore the use of deep learning in NLG and analyze the current state-of-the-art models, as well as potential future developments.

3. Development and evaluation of deep reinforcement learning (RL) for robotic navigation and control.

Introduction:  Robotic navigation and control are challenging tasks, which require a high degree of intelligence and adaptability. Deep RL has shown promising results in various robotics tasks, such as robotic arm control, autonomous navigation, and manipulation. This thesis aims to develop a deep RL-based approach for robotic navigation and control and evaluate its performance in various environments and tasks.

4. Investigating the use of deep learning for drug discovery and development.

Introduction:  Drug discovery and development is a time-consuming and expensive process, which often involves high failure rates. Deep learning has been used to improve various tasks in bioinformatics and biotechnology, such as protein structure prediction and gene expression analysis. This thesis aims to investigate the use of deep learning for drug discovery and development and examine its potential to improve the efficiency and accuracy of the drug development process.

5. Comparison of deep learning and traditional machine learning methods for anomaly detection in time series data.

Introduction:  Anomaly detection in time series data is a challenging task, which is important in various fields such as finance, healthcare, and manufacturing. Deep learning methods have been used to improve anomaly detection in time series data, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for anomaly detection in time series data and examine their respective strengths and weaknesses.

6. Use of deep transfer learning in speech recognition and synthesis.

Introduction:  Speech recognition and synthesis are areas of natural language processing that focus on converting spoken language to text and vice versa. Transfer learning has been widely used in deep learning-based speech recognition and synthesis systems to improve their performance by reusing the features learned from other tasks. This thesis aims to investigate the use of transfer learning in speech recognition and synthesis and how it improves the performance of the system in comparison to traditional methods.

7. The use of deep learning for financial prediction.

Introduction:  Financial prediction is a challenging task that requires a high degree of intelligence and adaptability, especially in the field of stock market prediction. Deep learning has shown promising results in various financial prediction tasks, such as stock price prediction and credit risk analysis. This thesis aims to investigate the use of deep learning for financial prediction and examine its potential to improve the accuracy of financial forecasting.

8. Investigating the use of deep learning for computer vision in agriculture.

Introduction:  Computer vision has the potential to revolutionize the field of agriculture by improving crop monitoring, precision farming, and yield prediction. Deep learning has been used to improve various computer vision tasks, such as object detection, semantic segmentation, and image classification. This thesis aims to investigate the use of deep learning for computer vision in agriculture and examine its potential to improve the efficiency and accuracy of crop monitoring and precision farming.

9. Development and evaluation of deep learning models for generative design in engineering and architecture.

Introduction:  Generative design is a powerful tool in engineering and architecture that can help optimize designs and reduce human error. Deep learning has been used to improve various generative design tasks, such as design optimization and form generation. This thesis aims to develop and evaluate deep learning models for generative design in engineering and architecture and examine their potential to improve the efficiency and accuracy of the design process.

10. Investigating the use of deep learning for natural language understanding.

Introduction:  Natural language understanding is a complex task of natural language processing that involves extracting meaning from text. Deep learning has been used to improve various NLP tasks, such as machine translation, sentiment analysis, and question-answering. This thesis aims to investigate the use of deep learning for natural language understanding and examine its potential to improve the efficiency and accuracy of natural language understanding systems.

11. Comparing deep learning and traditional machine learning methods for image compression.

Introduction:  Image compression is an important task in image processing and computer vision. It enables faster data transmission and storage of image files. Deep learning methods have been used to improve image compression, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for image compression and examine their respective strengths and weaknesses.

12. Using deep learning for sentiment analysis in social media.

Introduction:  Sentiment analysis in social media is an important task that can help businesses and organizations understand their customers’ opinions and feedback. Deep learning has been used to improve sentiment analysis in social media, by training models on large datasets of social media text. This thesis aims to use deep learning for sentiment analysis in social media, and evaluate its performance against traditional machine learning methods.

13. Investigating the use of deep learning for image generation.

Introduction:  Image generation is a task in computer vision that involves creating new images from scratch or modifying existing images. Deep learning has been used to improve various image generation tasks, such as super-resolution, style transfer, and face generation. This thesis aims to investigate the use of deep learning for image generation and examine its potential to improve the quality and diversity of generated images.

14. Development and evaluation of deep learning models for anomaly detection in cybersecurity.

Introduction:  Anomaly detection in cybersecurity is an important task that can help detect and prevent cyber-attacks. Deep learning has been used to improve various anomaly detection tasks, such as intrusion detection and malware detection. This thesis aims to develop and evaluate deep learning models for anomaly detection in cybersecurity and examine their potential to improve the efficiency and accuracy of cybersecurity systems.

15. Investigating the use of deep learning for natural language summarization.

Introduction:  Natural language summarization is an important task in natural language processing that involves creating a condensed version of a text that preserves its main meaning. Deep learning has been used to improve various natural language summarization tasks, such as document summarization and headline generation. This thesis aims to investigate the use of deep learning for natural language summarization and examine its potential to improve the efficiency and accuracy of natural language summarization systems.

16. Development and evaluation of deep learning models for facial expression recognition.

Introduction:  Facial expression recognition is an important task in computer vision and has many practical applications, such as human-computer interaction, emotion recognition, and psychological studies. Deep learning has been used to improve facial expression recognition, by training models on large datasets of images. This thesis aims to develop and evaluate deep learning models for facial expression recognition and examine their performance against traditional machine learning methods.

17. Investigating the use of deep learning for generative models in music and audio.

Introduction:  Music and audio synthesis is an important task in audio processing, which has many practical applications, such as music generation and speech synthesis. Deep learning has been used to improve generative models for music and audio, by training models on large datasets of audio data. This thesis aims to investigate the use of deep learning for generative models in music and audio and examine its potential to improve the quality and diversity of generated audio.

18. Comparing deep learning models with traditional algorithms for anomaly detection in network traffic.

Introduction:  Anomaly detection in network traffic is an important task that can help detect and prevent cyber-attacks. Deep learning models have been used for this task, and traditional methods such as clustering and rule-based systems are widely used as well. This thesis aims to compare deep learning models with traditional algorithms for anomaly detection in network traffic and analyze the trade-offs between the models in terms of accuracy and scalability.

19. Investigating the use of deep learning for improving recommender systems.

Introduction:  Recommender systems are widely used in many applications such as online shopping, music streaming, and movie streaming. Deep learning has been used to improve the performance of recommender systems, by training models on large datasets of user-item interactions. This thesis aims to investigate the use of deep learning for improving recommender systems and compare its performance with traditional content-based and collaborative filtering approaches.

20. Development and evaluation of deep learning models for multi-modal data analysis.

Introduction:  Multi-modal data analysis is the task of analyzing and understanding data from multiple sources such as text, images, and audio. Deep learning has been used to improve multi-modal data analysis, by training models on large datasets of multi-modal data. This thesis aims to develop and evaluate deep learning models for multi-modal data analysis and analyze their potential to improve performance in comparison to single-modal models.

I hope that this article has provided you with a useful guide for your thesis research in machine learning and deep learning. Remember to conduct a thorough literature review and to include proper citations in your work, as well as to be original in your research to avoid plagiarism. I wish you all the best of luck with your thesis and your research endeavors!

Bachelor's thesis

SirongHuang/Twitter-data-mining

Twitter data mining - social media as a rising customer service channel

Presentation slides ---- Thesis paper

Collect data from the Twitter API using Python scripts that record customer service interactions of major consumer electronics brands such as Apple and Samsung

Parse JSON data objects in Python and, with Pandas, organize tweets into customer service sessions with unique customers

Use NLTK (Natural Language Toolkit) to conduct text mining and sentiment analysis

Use a Naive Bayes classifier and the bag-of-words method to predict a customer's sentiment towards the customer service received

Unfortunately, my code was lost when my computer was damaged, and I was too inexperienced to back it up on GitHub.
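Because the original scripts are lost, the last step above can only be illustrated with a hypothetical sketch. The example below is not the author's code: it swaps NLTK for a standard-library-only multinomial Naive Bayes with add-one smoothing so that it runs without extra dependencies, and the class name, helper function, and toy tweets are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into word tokens (word order is ignored)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Hypothetical multinomial Naive Bayes over bag-of-words counts."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of training tweets
        self.vocabulary = set()

    def train(self, labeled_tweets):
        for text, label in labeled_tweets:
            self.label_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def predict(self, text):
        total_tweets = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_tweets in self.label_counts.items():
            # Start from the log prior, then add one log likelihood per known word.
            score = math.log(n_tweets / total_tweets)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocabulary:
                    continue  # words never seen in training carry no evidence
                count = self.word_counts[label][word]
                # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data standing in for labeled customer service tweets.
TRAINING_TWEETS = [
    ("thanks for the quick help, great support", "positive"),
    ("love how fast you fixed my phone", "positive"),
    ("awesome service, problem solved", "positive"),
    ("still waiting, terrible support", "negative"),
    ("my phone is broken and nobody helps", "negative"),
    ("worst service ever, very disappointed", "negative"),
]

classifier = NaiveBayesSentiment()
classifier.train(TRAINING_TWEETS)
print(classifier.predict("great service, thanks"))
print(classifier.predict("terrible, still broken"))
```

With real data, the training pairs would come from the labeled customer service sessions rather than a hard-coded list, and a library classifier would likely replace the hand-written one.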

Chair of Process and Data Science

Thesis Projects

  • PADS - Teaching
  • PADS Excellence Class
  • Pro-Seminars
  • Bachelor Thesis - Development of a Graphical User Interface Tool for Object-Centric Process Modeling
  • Master Thesis - Leveraging LLM for Translating Process Models into Text: A Study on Prompting Strategies and Quality Evaluation
  • Master Thesis - Investigating Novel Techniques for Case Reconstruction in SLURM Logs
  • Master Thesis - An Object-Centric Process Query Language
  • Bachelor Thesis - Visualizing Object-Centric Petri Nets
  • Master Thesis - Discover Medication Change Paths
  • Master Thesis - Strategic Sampling and Abstraction in Process Mining: Navigating Complexity for Cohesive Operational Insights
  • Bachelor Thesis - Analyzing Project Workflows through Git History and Gitlab Tickets
  • Bachelor Thesis - Extending the GAIA-X Eclipse Data Space Connector with an Event Logger
  • Completed Theses
  • Depth Oral Colloquium

We are continuously looking for students interested in doing a thesis project in the PADS group. We are eager to supervise both Bachelor and Master Thesis projects. However, given the many requests, we need to be selective and align such projects to our expertise and research goals. Therefore, we require people to have a data-science mindset and an interest in processes and dynamic behavior. Most of the thesis projects are in the field of process mining. Therefore, we require potential students to show that they have an understanding of the existing field. Having process mining knowledge, e.g., obtained in the Business Process Intelligence (BPI) course at RWTH running in the second semester (SS) or the Coursera MOOC on Process Mining, is desirable. This makes it easier to discuss possible thesis projects in areas such as process discovery, conformance checking, performance analysis, predictive process analytics, automated process improvement, responsible process mining, etc. If you can still elect your courses, we also recommend the master courses Introduction to Data Science (WS) and Advanced Process Mining (SS). Also, check out the Seminars and practical assignments running every semester.

If you are interested, please look through the list of available theses on the website. If you have a good resume, good grades, and special expertise and experience in the above-mentioned area but were unable to find a suitable thesis online, please fill out the thesis inquiry form and send it to Mahsa Pourbafrani [email protected] along with a brief motivation, your CV, and grades. She is in charge of the thesis applications for the PADS group. She will ask further questions to determine whether we can provide you with a thesis that meets your requirements and expertise.

External Thesis Projects

We do NOT supervise external thesis projects unless there is an existing collaboration (e.g., with Celonis) and the topic is related to what we do (e.g., process mining). We welcome bright students who want to specialize in the topics we cover (selected topics on the interface between data science and process science, in particular, process mining). However, we only host students who know about these topics and have shown commitment to dive deeper. Moreover, we only supervise students in our area of expertise who are working on assignments that are carefully defined by us (or in a collaborative effort). Students deserve good supervision. Note that this is also the general policy of the Computer Science department. Therefore, do not be misled by groups outside of Computer Science offering such thesis projects.

Note that the chair also has many possibilities for HiWi jobs related to process mining (either within Fraunhofer FIT or RWTH). However, these are also reserved for excellent students who have acquired a background in process mining and who want to continue a career in process mining (in industry or academia). We discourage people from applying for HiWi jobs without being able to show relevant experience. HiWi opportunities typically follow (or run in parallel with) a thesis project or excellent performance in one of our courses.

last updated: 08/04/2024

The Research Repository @ WVU

Mining Engineering Graduate Theses and Dissertations

Theses/Dissertations from 2023

Development of A Hydrometallurgical Process for the Extraction of Cobalt, Manganese, and Nickel from Acid Mine Drainage Treatment Byproduct , Alejandro Agudelo Mira

Selective Recovery of Rare Earth Elements from Acid Mine Drainage Treatment Byproduct , Zeynep Cicek

Identification of Rockmass Deformation and Lithological Changes in Underground Mines by Using Slam-Based Lidar Technology , Francisco Eduardo Gil Hurtado

Analysis of the Brittle Failure Mechanism of Underground Stone Mine Pillars by Implementing Numerical Modeling in FLAC3D , Rosbel Jimenez

Analysis of the root causes of fatal injuries in the United States surface mines between 2008 and 2021. , Maria Fernanda Quintero

AUGMENTED REALITY AND MOBILE SYSTEMS FOR HEAVY EQUIPMENT OPERATORS IN SURFACE MINING , Juan David Valencia Quiceno

Theses/Dissertations from 2022

Integrated Large Discontinuity Factor, Lamodel and Stability Mapping Approach for Stone Mine Pillar Stability , Mustafa Baris Ates

Noise Exposure Trends Among Violating Coal Mines, 2000 to 2021 , Hanna Grace Davis

Calcite depression in bastnaesite-calcite flotation system using organic acids , Emmy Muhoza

Investigation of Geomechanical Behavior of Laminated Rock Mass Through Experimental and Numerical Approach , Qingwen Shi

Static Liquefaction in Tailing Dams , Jose Raul Zela Concha

Experimental and Theoretical Investigation on the Initiation Mechanism of Low-Rank Coal's Self-Heating Process , Yinan Zhang

Development of an Entry-Scale Modeling Methodology to Provide Ground Reaction Curves for Longwall Gateroad Support Evaluation , Haochen Zhao

Size effect and anisotropy on the strength of shale under compressive stress conditions , Yun Zhao

Theses/Dissertations from 2021

Evaluation of LIDAR systems for rock mass discontinuity identification in underground stone mines from 3D point cloud data , Mario Alejandro Bendezu de la Cruz

Implementing the Empirical Stone Mine Pillar Strength Equation into the Boundary Element Method Software LaModel , Samuel Escobar

Recovery of Phosphorus from Florida Phosphatic Waste Clay , Amir Eskanlou

Optimization of Operating Conditions and Design Parameters on Coal Ultra-Fine Grinding Through Kinetic Stirred Mill Tests and Numerical Modeling , Francisco Patino

The Effect of Natural Fractures on the Mechanical Behavior of Limestone Pillars: A Synthetic Rock Mass Approach Application , Mustafa Can Süner

Evaluation of Various Separation Techniques for the Removal of Actinides from A Rare Earth-Containing Solution Generated from Coarse Coal Refuse , Deniz Talan

Geology Oriented Loading Approach for Underground Coal Mines , Deniz Tuncay

Various Operational Aspects of the Extraction of Critical Minerals from Acid Mine Drainage and Its Treatment By-product , Zhongqing Xiao

Theses/Dissertations from 2020

Adaptation of Coal Mine Floor Rating (CMFR) to Eastern U.S. Coal Mines , Sena Cicek

Upstream Tailings Dam - Liquefaction , Mladen Dragic

Development, Analysis and Case Studies of Impact Resistant Steel Sets for Underground Roof Fall Rehabilitation , Dakota D. Faulkner

The influence of spatial variance on rock strength and mechanism of failure , Danqing Gao

Fundamental Studies on the Recovery of Rare Earth Elements from Acid Mine Drainage , Xue Huang

Rational drilling control parameters to reduce respirable dust during roof bolting operations , Hua Jiang

Solutions to Some Mine Subsidence Research Challenges , Jian Yang

An Interactive Mobile Equipment Task-Training with Virtual Reality , Lazar Zujovic

Theses/Dissertations from 2019

Fundamental Mechanism of Time Dependent Failure in Shale , Neel Gupta

A Critical Assessment on the Resources and Extraction of Rare Earth Elements from Acid Mine Drainage , Christopher R. Vass

Time-dependent deformation and associated failure of roof in underground mines , Yuting Xue

Theses/Dissertations from 2018

Parametric Study of Coal Liberation Behavior Using Silica Grinding Media , Adewale Wasiu Adeniji

Three-dimensional Numerical Modeling Encompassing the Stability of a Vertical Gas Well Subjected to Longwall Mining Operation - A Case Study , Bonaventura Alves Mangu Bali

Shale Characterization and Size-effect study using Scanning Electron Microscopy and X-Ray Diffraction , Debashis Das

Behaviour Of Laminated Roof Under High Horizontal Stress , Prasoon Garg

Theses/Dissertations from 2017

Optimization of Mineral Processing Circuit Design under Uncertainty , Seyed Hassan Amini

Evaluation of Ultrasonic Velocity Tests to Characterize Extraterrestrial Rock Masses , Thomas W. Edge II

A Photogrammetry Program for Physical Modeling of Subsurface Subsidence Process , Yujia Lian

An Area-Based Calculation of the Analysis of Roof Bolt Systems (ARBS) , Aanand Nandula

Developing and implementing new algorithms into the LaModel program for numerical analysis of multiple seam interactions , Mehdi Rajaeebaygi

Adapting Roof Support Methods for Anchoring Satellites on Asteroids , Grant B. Speer

Simulation of Venturi Tube Design for Column Flotation Using Computational Fluid Dynamics , Wan Wang

Theses/Dissertations from 2016

Critical Analysis of Longwall Ventilation Systems and Removal of Methane , Robert B. Krog

Implementing the Local Mine Stiffness Calculation in LaModel , Kaifang Li

Development of Emission Factors (EFs) Model for Coal Train Loading Operations , Bisleshana Brahma Prakash

Nondestructive Methods to Characterize Rock Mechanical Properties at Low-Temperature: Applications for Asteroid Capture Technologies , Kara A. Savage

Mineral Asset Valuation Under Economic Uncertainty: A Complex System for Operational Flexibility , Marcell B. B. Silveira

A Feasibility Study for the Automated Monitoring and Control of Mine Water Discharges , Christopher R. Vass

Spontaneous Combustion of South American Coal , Brunno C. C. Vieira

Calibrating LaModel for Subsidence , Jian Yang

Theses/Dissertations from 2015

Coal Quality Management Model for a Dome Storage (DS-CQMM) , Manuel Alejandro Badani Prado

Design Programs for Highwall Mining Operations , Ming Fan

Development of Drilling Control Technology to Reduce Drilling Noise during Roof Bolting Operations , Mingming Li

The Online LaModel User's & Training Manual Development & Testing , Christopher R. Newman

How to mitigate coal mine bumps through understanding the violent failure of coal specimens , Gamal Rashed

Theses/Dissertations from 2014

Effect of biaxial and triaxial stresses on coal mine shale rocks , Shrey Arora

Stability Analysis of Bleeder Entries in Underground Coal Mines Using the Displacement-Discontinuity and Finite-Difference Programs , Xu Tang

Experimental and Theoretical Studies of Kinetics and Quality Parameters to Determine Spontaneous Combustion Propensity of U.S. Coals , Xinyang Wang

Bubble Size Effects in Coal Flotation and Phosphate Reverse Flotation using a Pico-nano Bubble Generator , Yu Xiong

Integrating the LaModel and ARMPS Programs (ARMPS-LAM) , Peng Zhang

Theses/Dissertations from 2013

Column Flotation of Subbituminous Coal Using the Blend of Trimethyl Pentanediol Derivatives and Pico-Nano Bubbles , Jinxiang Chen

Applications of Surface and Subsurface Subsidence Theories to Solve Ground Control Problems , Biao Qiu

Calibrating the LaModel Program for Shallow Cover Multiple-Seam Mines , Morgan M. Sears

The Integration of a Coal Mine Emergency Communication Network into Pre-Mine Planning and Development , Mark F. Sindelar

Factors considered for increasing longwall panel width , Jack D. Trackemas

An experimental investigation of the creep behavior of an underground coalmine roof with shale formation , Priyesh Verma

Evaluation of Rope Shovel Operators in Surface Coal Mining Using a Multi-Attribute Decision-Making Model , Ivana M. Vukotic

Theses/Dissertations from 2012

Calculating the Surface Seismic Signal from a Trapped Miner , Adeniyi A. Adebisi

Comprehensive and Integrated Model for Atmospheric Status in Sealed Underground Mine Areas , Jianwei Cheng

Production and Cost Assessment of a Potential Application of Surface Miners in Coal Mining in West Virginia , Timothy A. Nolan

The Integration of Geomorphic Design into West Virginia Surface Mine Reclamation , Alison E. Sears

Truck Cycle and Delay Automated Data Collection System (TCD-ADCS) for Surface Coal Mining , Patricio G. Terrazas Prado

New Abutment Angle Concept for Underground Coal Mining , Ihsan Berk Tulu

Theses/Dissertations from 2011

Experimental analysis of the post-failure behavior of coal and rock under laboratory compression tests , Dachao Neil Nie

The influence of interface friction and w/h ratio on the violence of coal specimen failure , Simon H. Prassetyo

Theses/Dissertations from 2010

A risk management approach to pillar extraction in the Central Appalachian coalfields , Patrick R. Bucks

The Impacts of Longwall Mining on Groundwater Systems -- A Case of Cumberland Mine Panels B5 and B6 , Xinzhi Du

Evaluation of ultrafine spiral concentrators for coal cleaning , Meng Yang

Theses/Dissertations from 2009

Development of a coal reserve GIS model and estimation of the recoverability and extraction costs , Chandrakanth Reddy Apala

Application and evaluation of spiral separators for fine coal cleaning , Zhuping Che

Weak floor stability in the Illinois Basin underground coal mines , Murali M. Gadde

Design of reinforced concrete seals for underground coal mines , Rajagopala Reddy Kallu

Employing laboratory physical modeling to study the radio imaging method (RIM) , Jun Lu

Influence of cutting sequence and time effects on cutters and roof falls in underground coal mine -- numerical approach , Anil Kumar Ray

Implementing energy release rate calculations into the LaModel program , Morgan M. Sears

Modeling PDC cutter rock interaction , Ihsan Berk Tulu

Analytical determination of strain energy for the studies of coal mine bumps , Qiang Xu

Improvement of the mine fire simulation program MFIRE , Lihong Zhou

Theses/Dissertations from 2008

Program-assisted analysis of the transverse pressure capacity of block stoppings for mine ventilation control , Timothy J. Batchler

Analysis of factors affecting wireless communication systems in underground coal mines , David P. McGraw

Analysis of underground coal mine refuge shelters , Mickey D. Mitchell

Theses/Dissertations from 2007

Dolomite flotation of high magnesium phosphate ores using fatty acid soap collectors , Zhengxing Gu

Evaluation of longwall face support hydraulic supply systems , Ted M. Klemetti II

Experimental studies of electromagnetic signals to enhance radio imaging method (RIM) , William D. Monaghan

Analysis of water monitoring data for longwall panels , Joseph R. Zirkle

Theses/Dissertations from 2006

Measurements of the electrical properties of coal measure rocks , Nikolay D. Boykov

Geomechanical and weathering properties of weak roof shales in coal mines , Hakan Gurgenli

Assessment and evaluation of noise controls on roof bolting equipment and a method for predicting sound pressure levels in underground coal mining , Rudy J. Matetic

COMMENTS

  1. PDF The application of data mining methods

    This thesis first introduces the basic concepts of data mining, such as the definition of data mining, its basic function, common methods and basic process, and two common data mining methods, classification and clustering. Then a data mining application in network is discussed in detail, followed by a brief introduction on data mining ...

  2. PDF Big data mining

    3 DATA Bachelor Thesis 2020/2021 - Richie Lee 2 Literature The foundation of this thesis, Mining big data using parsimonious factor, machine learning, variable selection and shrinkage methods by Kim and Swanson (2018) focuses on the usefulness of factor models in the context of prediction using big data. In particular, this research examines perfor-

  3. Open Theses

    Open Topics We offer multiple Bachelor/Master theses, Guided Research projects and IDPs in the area of data mining/machine learning. A non-exhaustive list of open topics is listed below.. If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential ...

  4. Data Mining

    Data Mining. Data Science; Data and Artificial Intelligence; Overview; Fingerprint; Network; Researchers (46) Projects (2) Research output (630) Datasets (4) Prizes (17) ... Student thesis: Bachelor. File. A Deep Learning Approach for Clustering a Multi-Class Dataset Author: Kamat, V., 23 Jan 2023.

  5. PDF DATA PREPROCESSING FOR DATA MINING

    1 INTRODUCTION TO DATA MINING 5 1.1 Background 5 1.2 Definition 6 1.3 Data Source 7 1.4 Application 8 1.5 Challenges 10 2 RELATED TECHNIQUES VS DATA MINING 12 2.1 Data warehouse 12 2.2 Online analytical processing 13 2.3 Statistics and Machine Learning 14 3 WORKING THEORY OF DATA MINING 16 3.1 Task 16 3.2 Process 18 3.3 Data preprocessing 20

  6. PDF Data mining using open source software for small business Including

    Bachelor Thesis Degree Programme in BIT 2016. Abstract 3th Mai 2015 Authors Antoine Dubuis The title of your thesis Data mining using open source software for small business Number of pages and appendices 63+4 Supervisors Arvo Lipitsainen - Advisor from Haaga-Helia University of Applied Sciences

  7. Data Mining research of bachelor's degree dissertation of College of

    This study tries to find out some rules and characteristics through the data mining of students' dissertation topics, such as the College of Electronic and Electrical Engineering, dissertation of undergraduate students majoring in Electrical Engineering and Automation in the past three years. The study employs the quantitative method of research. In this study, we will use the process of K-DD ...

  8. Application of Data Mining Methods for Customer Clustering

    The motivation behind this thesis is to investigate the value of clustering in the machine learning/data mining context for customer segmentation. Classical database marketing methods are combined with data mining tools. Data mining techniques can be used to create the segments automatically.

  9. PDF A MINING TECHNIQUES F UCTURED AND

    ery (data mining) from this data has become very important for the business and scientific-research communities alike. This doctoral thesis introduces Query Flocks, a general framework over relational data that enables the declarative formulation, systematic optimization, and efficient processing of a large class of mining ...

  10. PDF Hash-based Approach to Data Mining

    My thesis, with the subject "hash-based approach to data mining" focuses on the hash-based method to improve performance of finding association rules in the transaction databases and use the PHS (perfect hashing and data shrinking) algorithm to build a system, which helps directors of shops/stores to have a detailed view about his business.

  11. data mining Latest Research Papers

    The accurate average value is 74.05% of the existing COID algorithm, and our proposed algorithm has 77.21%. The average recall value is 81.19% and 89.51% of the existing and proposed algorithm, which shows that the proposed work efficiency is better than the existing COID algorithm. Download Full-text.

  12. PDF Use case based introduction to process mining and current tools

    This thesis will introduce the most important process mining techniques and apply them to uses cases that are based on real life event data. Three process mining tools, ProM, Disco and Celonis, will be introduced and used to apply process mining techniques. 1. Introduction.

  13. PDF A REVIEW OF DATA MINING IN BIOINFORMATICS

    The aim of this bachelor's thesis is to highlight and discuss in detail the application of data mining techniques in bioinformatics. It begins by discussing the interdisciplinary relationship between data mining, knowledge discovery and bioinformatics before a comprehensive descriptive research in data

  14. Bachelor and Master Theses

    Ziqiu Zhou: Semantic Extensions of OSM Data Through Mining Tweets in the Domain of Disaster Management, Master Thesis, May 2022. Lukas Ballweg: Analysis of Lobby Networks and their Extraction from Semi-Structured Data, Bachelor Thesis, April 2022.

  15. Bachelor and Master Thesis : Professorship of Data Science

    Bachelor and Master Thesis. We offer a variety of cutting-edge and exciting research topics for Bachelor's and Master's theses. We cover a wide range of topics from Data Science, Natural Language Processing, Argument Mining, the Use of AI in Business, Ethics in AI and Multimodal AI. We are always open to suggestions for your own topics, so ...

  16. Bachelor and Master Thesis

    Our team offers bachelor and master thesis topics as well as student projects to be written in English. ... Actionable Recommendation for Learner in Learning Management System based on Process Mining (Bachelor/Master) ... L.M., Nat, M. Student Engagement Patterns in a Blended Learning Environment: an Educational Data Mining Approach. TechTrends ...

  17. The Future of AI Research: 20 Thesis Ideas for Undergraduate ...

    This article provides a list of 20 potential thesis ideas for an undergraduate program in machine learning and deep learning in 2023. Each thesis idea includes an introduction, which presents a brief overview of the topic and the research objectives. The ideas provided are related to different areas of machine learning and deep learning, such ...

  18. PDF Data mining in medical diagnostic support system

    Bachelor's Thesis, Degree Programme in BIT, 2019. Abstract. Date: 9 May 2019. Author(s): Khoa Nguyen. Report/thesis title: Data mining in medical diagnostic support system. Number of pages and appendix pages: 45 + 5. Health and education are always vital issues for any country in the world. ... Data mining is a technology based ...

  19. PDF Bachelor Thesis A machine learning approach to enhance the ...

    Bachelor Thesis: A machine learning approach to enhance the privacy of customers (En maskininlärningsmetod för ökad kundintegritet). Jesper Anderberg ... The report also examines how data mining is affected in the context of private information. In the first case, the authors collected biometric samples from 200 people. According to

  20. GitHub

    Twitter data mining - social media as a rising customer service channel. Presentation slides ---- Thesis paper. Collect data from the Twitter API using Python scripts that record customer service interactions for major consumer electronics brands such as Apple and Samsung. Parse JSON data objects in Python and organize tweets into customer ...

  21. Thesis Projects

    Thesis Projects. We are continuously looking for students interested in doing a thesis project in the PADS group. We are eager to supervise both Bachelor and Master Thesis projects. However, given the many requests, we need to be selective and align such projects with our expertise and research goals. Therefore, we require people to have a data ...

  22. PDF Data Mining Thesis Topics in Finland

    Bachelor of Engineering, Information Technology. Thesis, 5 May 2017. Abstract. Author: Ari Bajo Rouvinen. Title: Data Mining Thesis Topics in Finland. Number of pages: 46 pages ... This thesis is based on data mining the Theseus dataset. This dataset is maintained by Arene Ry [1], the Rector's Conference of Finnish Universities of Applied ...

  23. Mining Engineering Graduate Theses and Dissertations

    Truck Cycle and Delay Automated Data Collection System (TCD-ADCS) for Surface Coal Mining, Patricio G. Terrazas Prado. PDF. New Abutment Angle Concept for Underground Coal Mining, Ihsan Berk Tulu. Theses/Dissertations from 2011 PDF. Experimental analysis of the post-failure behavior of coal and rock under laboratory compression tests, Dachao ...