214 Best Big Data Research Topics for Your Thesis Paper

Finding an ideal big data research topic can take a long time. Big data, IoT, and robotics have evolved rapidly, and future generations will be immersed in technologies that make work easier. Work that once took 10 people can now be done by one person or a machine. Although some jobs will be lost, new ones will also be created.

Big data is a major topic that is being embraced globally. Data science and analytics are helping institutions, governments, and the private sector. We will share with you the best big data research topics.

On top of that, we offer writing tips to help you succeed in your studies. As a university student, you need to do proper research to get top grades. If you need research paper writing services, you can consult us.

Big Data Analytics Research Topics for your Research Project

Are you looking for an ideal big data analytics research topic? Once you choose a topic, consult your professor to evaluate whether it is a great topic. This will help you to get good grades.

  • Which are the best tools and software for big data processing?
  • Evaluate the security issues that face big data.
  • An analysis of large-scale data for social networks globally.
  • The influence of big data storage systems.
  • The best platforms for big data computing.
  • The relation between business intelligence and big data analytics.
  • The importance of semantics and visualization of big data.
  • Analysis of big data technologies for businesses.
  • The common methods used for machine learning in big data.
  • The difference between self-tuning and symmetrical spectral clustering.
  • The importance of information-based clustering.
  • Evaluate the hierarchical clustering and density-based clustering application.
  • How is data mining used to analyze transaction data?
  • The major importance of dependency modeling.
  • The influence of probabilistic classification in data mining.

Interesting Big Data Analytics Topics

Who said big data had to be boring? Here are some interesting big data analytics topics that you can try. They focus on how big data techniques are applied to real-world problems.

  • Discuss the privacy issues in big data.
  • Evaluate scalable storage systems in big data.
  • The best big data processing software and tools.
  • Popular data mining tools and techniques.
  • Evaluate the scalable architectures for parallel data processing.
  • The major natural language processing methods.
  • Which are the best big data tools and deployment platforms?
  • The best algorithms for data visualization.
  • Analyze anomaly detection in cloud servers.
  • The scrutiny normally done for the recruitment of big data job profiles.
  • The malicious user detection in big data collection.
  • Learning long-term dependencies via the Fourier recurrent units.
  • Nomadic computing for big data analytics.
  • The elementary estimators for graphical models.
  • The memory-efficient kernel approximation.

Big Data Latest Research Topics

Do you know the latest research topics at the moment? These 15 topics will help you to dive into interesting research. You may even build on research done by other scholars.

  • Evaluate the data mining process.
  • The influence of the various dimension reduction methods and techniques.
  • The best data classification methods.
  • The simple linear regression modeling methods.
  • Evaluate the logistic regression modeling.
  • What are the commonly used theorems?
  • The influence of cluster analysis methods in big data.
  • The importance of smoothing methods analysis in big data.
  • How is fraud detection done through AI?
  • Analyze the use of GIS and spatial data.
  • How important is artificial intelligence in the modern world?
  • What is agile data science?
  • Analyze the behavioral analytics process.
  • Semantic analytics distribution.
  • How is domain knowledge important in data analysis?

Big Data Debate Topics

If you want to succeed in the field of big data, you need to tackle difficult topics too. These big data debate topics are interesting and will help you gain a better understanding of the field.

  • The difference between big data analytics and traditional data analytics methods.
  • Why do you think the organization should think beyond the Hadoop hype?
  • Does the size of the data matter more than how recent the data is?
  • Is it true that bigger data are not always better?
  • The debate of privacy and personalization in maintaining ethics in big data.
  • The relation between data science and privacy.
  • Do you think data science is a rebranding of statistics?
  • Who delivers better results between data scientists and domain experts?
  • According to your view, is data science dead?
  • Do you think analytics teams need to be centralized or decentralized?
  • The best methods to resource an analytics team.
  • The best business case for investing in analytics.
  • The societal implications of the use of predictive analytics within Education.
  • Is there a need for greater control to prevent experimentation on social media users without their consent?
  • How is the government using big data: to improve public statistics or to control the population?

University Dissertation Topics on Big Data

Are you doing your Master's or Ph.D. and wondering which dissertation or thesis topic to choose? Why not try one of these? They are interesting and cover a range of phenomena. While doing the research, make sure you relate each phenomenon to modern society.

  • The machine learning algorithms used for fall recognition.
  • The divergence and convergence of the internet of things.
  • The reliable data movements using bandwidth provision strategies.
  • How is big data analytics using artificial neural networks in cloud gaming?
  • How is Twitter accounts classification done using network-based features?
  • How is online anomaly detection done in the cloud collaborative environment?
  • Evaluate the public transportation insights provided by big data.
  • Evaluate the paradigm for cancer patients using the nursing EHR to predict the outcome.
  • Discuss the current data lossless compression in the smart grid.
  • How does online advertising traffic prediction help in boosting businesses?
  • How is the hyperspectral classification done using the multiple kernel learning paradigm?
  • The analysis of large data sets downloaded from websites.
  • How does social media data help advertising companies globally?
  • Which are the systems recognizing and enforcing ownership of data records?
  • The alternate possibilities emerging for edge computing.

The Best Big Data Analysis Research Topics and Essays

There are a lot of issues that are associated with big data. Here are some of the research topics that you can use in your essays. These topics are ideal whether in high school or college.

  • The various errors and uncertainty in making data decisions.
  • The application of big data on tourism.
  • Automation innovation with big data or related technology.
  • The business models of big data ecosystems.
  • Privacy awareness in the era of big data and machine learning.
  • The data privacy for big automotive data.
  • How is traffic managed in defined data center networks?
  • Big data analytics for fault detection.
  • The need for machine learning with big data.
  • The innovative big data processing used in health care institutions.
  • The money normalization and extraction from texts.
  • How is text categorization done in AI?
  • The opportunistic development of data-driven interactive applications.
  • The use of data science and big data towards personalized medicine.
  • The programming and optimization of big data applications.

The Latest Big Data Research Topics for your Research Proposal

Doing a research proposal can be hard at first unless you choose an ideal topic. If you are just diving into the big data field, you can use any of these topics to get a deeper understanding.

  • The data-centric network of things.
  • Big data management using artificial intelligence supply chain.
  • The big data analytics for maintenance.
  • The high confidence network predictions for big biological data.
  • The performance optimization techniques and tools for data-intensive computation platforms.
  • The predictive modeling in the legal context.
  • Analysis of large data sets in life sciences.
  • How to understand mobility and transport modal disparities using emerging data sources?
  • How do you think data analytics can support asset management decisions?
  • An analysis of travel patterns for cellular network data.
  • The data-driven strategic planning for citywide building retrofitting.
  • How is money normalization done in data analytics?
  • Major techniques used in data mining.
  • The big data adaptation and analytics of cloud computing.
  • The predictive data maintenance for fault diagnosis.

Interesting Research Topics on A/B Testing In Big Data

A/B testing topics are different from the normal big data topics. However, you use an almost similar methodology to find the reasons behind the issues. These topics are interesting and will help you to get a deeper understanding.

  • How is ultra-targeted marketing done?
  • The transition of A/B testing from digital to offline.
  • How can big data and A/B testing be done to win an election?
  • Evaluate the use of A/B testing on big data.
  • Evaluate A/B testing as a randomized control experiment.
  • How does A/B testing work?
  • The mistakes to avoid while conducting the A/B testing.
  • The most ideal time to use A/B testing.
  • The best way to interpret results for an A/B test.
  • The major principles of A/B tests.
  • Evaluate cluster randomization in big data.
  • The best way to analyze A/B test results and the statistical significance.
  • How is A/B testing used in boosting businesses?
  • The importance of data analysis in conversion research.
  • The importance of A/B testing in data science.
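
Several of the topics above concern interpreting A/B test results and their statistical significance. As a minimal sketch, assuming each variant records a number of visitors and a number of conversions, a pooled two-proportion z-test can be written with the Python standard library alone. The function name and the sample counts are hypothetical, and a z-test is only one of several reasonable ways to analyze such an experiment.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a, conv_b: number of conversions in variants A and B
    n_a, n_b: number of visitors shown variants A and B
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical counts: 100 of 1000 visitors convert on A, 150 of 1000 on B.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference is unlikely to be due to chance alone, but the threshold and the sample size should be fixed before the test is run.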

Amazing Research Topics on Big Data and Local Governments

Governments and public institutions are now using big data to improve the lives of citizens. These topics are based on real-life uses of data in the public sector.

  • Assess the benefits and barriers of big data in the public sector.
  • The best approach to smart city data ecosystems.
  • The big analytics used for policymaking.
  • Evaluate the smart technology and emergence algorithm bureaucracy.
  • Evaluate the use of citizen scoring in public services.
  • An analysis of the government administrative data globally.
  • Public values in the era of big data.
  • Public engagement on local government data use.
  • Data analytics use in policymaking.
  • How are algorithms used in public sector decision-making?
  • The democratic governance in the big data era.
  • The best business model innovation to be used in sustainable organizations.
  • How does the government use the collected data from various sources?
  • The role of big data for smart cities.
  • How does big data play a role in policymaking?

Easy Research Topics on Big Data

Who said big data topics had to be hard? Here are some of the easiest research topics. They are based on data management, research, and data retention. Pick one and try it!

  • Who uses big data analytics?
  • Evaluate structured machine learning.
  • Explain the whole deep learning process.
  • Which are the best ways to manage platforms for enterprise analytics?
  • Which are the new technologies used in data management?
  • What is the importance of data retention?
  • The best way to work with images when doing research.
  • The best way to promote research outreach through data management.
  • The best way to source and manage external data.
  • Does machine learning improve the quality of data?
  • Describe the security technologies that can be used in data protection.
  • Evaluate token-based authentication and its importance.
  • How can poor data security lead to the loss of information?
  • How to determine secure data.
  • What is the importance of centralized key management?

Unique IoT and Big Data Research Topics

The Internet of Things has evolved, and many devices now rely on it. There are smart devices, smart cities, smart locks, and much more. Many things can now be controlled at the touch of a button.

  • Evaluate the 5G networks and IoT.
  • Analyze the use of Artificial intelligence in the modern world.
  • How do ultra-low-power IoT technologies work?
  • Evaluate the adaptive systems and models at runtime.
  • How have smart cities and smart environments improved the living space?
  • The importance of the IoT-based supply chains.
  • How does smart agriculture influence water management?
  • The internet applications naming and identifiers.
  • How does the smart grid influence energy management?
  • Which are the best design principles for IoT application development?
  • The best human-device interactions for the Internet of Things.
  • The relation between urban dynamics and crowdsourcing services.
  • The best wireless sensor network for IoT security.
  • The best intrusion detection in IoT.
  • The importance of big data on the Internet of Things.

Big Data Database Research Topics You Should Try

Big data is broad and interesting. These big data database research topics will put you in a better place in your research. You also get to evaluate the roles of various phenomena.

  • The best cloud computing platforms for big data analytics.
  • The parallel programming techniques for big data processing.
  • The importance of big data models and algorithms in research.
  • Evaluate the role of big data analytics for smart healthcare.
  • How is big data analytics used in business intelligence?
  • The best machine learning methods for big data.
  • Evaluate the Hadoop programming in big data analytics.
  • What is privacy-preserving big data analytics?
  • The best tools for massive big data processing.
  • IoT deployment in Governments and Internet service providers.
  • How will IoT be used for future internet architectures?
  • How does big data close the gap between research and implementation?
  • What are the cross-layer attacks in IoT?
  • The influence of big data and smart city planning in society.
  • Why do you think user access control is important?

Big Data Scala Research Topics

Scala is a programming language that runs on the Java Virtual Machine and is widely used in big data processing; Apache Spark, for example, is written in Scala. Here are some of the best Scala questions that you can research.

  • Which are the most used languages in big data?
  • How is Scala used in big data research?
  • Is Scala better than Java in big data?
  • How is Scala a concise programming language?
  • How is Scala used for real-time stream processing?
  • Which are the various libraries for data science and data analysis?
  • How does Scala allow imperative programming in data collection?
  • Evaluate how Scala includes a useful REPL for interaction.
  • Evaluate Scala’s IDE support.
  • The data catalog reference model.
  • Evaluate the basics of data management and its influence on research.
  • Discuss the behavioral analytics process.
  • What can you term as the experience economy?
  • The difference between agile data science and the Scala language.
  • Explain the graph analytics process.

Independent Research Topics for Big Data

These independent research topics for big data are based on the various technologies and how they are related. Big data will be of great importance to modern society.

  • The biggest investments in big data analysis.
  • How are multi-cloud and hybrid settings taking deep root?
  • Why do you think machine learning will be in focus for a long while?
  • Discuss in-memory computing.
  • What is the difference between edge computing and in-memory computing?
  • The relation between the Internet of things and big data.
  • How will digital transformation make the world a better place?
  • How does data analysis help in social network optimization?
  • How will complex big data be essential for future enterprises?
  • Compare the various big data frameworks.
  • The best way to gather and monitor traffic information using CCTV images.
  • Evaluate the hierarchical structure of groups and clusters in the decision tree.
  • Which are the 3D mapping techniques for live streaming data?
  • How does machine learning help to improve data analysis?
  • Evaluate data stream management in task allocation.
  • How is big data provisioned through edge computing?
  • The model-based clustering of texts.
  • The best ways to manage big data.
  • The use of machine learning in big data.

Is Your Big Data Thesis Giving You Problems?

These are some of the best topics that you can use to succeed in your studies. Not only are they easy to research, but they also reflect real-world issues. Whether in university or college, you need to put enough effort into your studies to succeed. However, if you have time constraints, we can provide professional writing help. Are you looking for expert writers online? Look no further: we provide quality work at a low price.

Grad Coach

Research Topics & Ideas: Data Science

50 Topic Ideas To Kickstart Your Research Project

Research topics and ideas about data science and big data analytics

If you’re just starting out exploring data science-related topics for your dissertation, thesis or research project, you’ve come to the right place. In this post, we’ll help kickstart your research by providing a hearty list of data science and analytics-related research ideas, including examples from recent studies.

PS – This is just the start…

We know it’s exciting to run through a list of research topics, but please keep in mind that this list is just a starting point. The topic ideas provided here are intentionally broad and generic, so you will need to develop them further. Nevertheless, they should inspire some ideas for your project.

To develop a suitable research topic, you’ll need to identify a clear and convincing research gap, and a viable plan to fill that gap. If this sounds foreign to you, check out our free research topic webinar that explores how to find and refine a high-quality research topic, from scratch. Alternatively, consider our 1-on-1 coaching service.

Data Science-Related Research Topics

  • Developing machine learning models for real-time fraud detection in online transactions.
  • The use of big data analytics in predicting and managing urban traffic flow.
  • Investigating the effectiveness of data mining techniques in identifying early signs of mental health issues from social media usage.
  • The application of predictive analytics in personalizing cancer treatment plans.
  • Analyzing consumer behavior through big data to enhance retail marketing strategies.
  • The role of data science in optimizing renewable energy generation from wind farms.
  • Developing natural language processing algorithms for real-time news aggregation and summarization.
  • The application of big data in monitoring and predicting epidemic outbreaks.
  • Investigating the use of machine learning in automating credit scoring for microfinance.
  • The role of data analytics in improving patient care in telemedicine.
  • Developing AI-driven models for predictive maintenance in the manufacturing industry.
  • The use of big data analytics in enhancing cybersecurity threat intelligence.
  • Investigating the impact of sentiment analysis on brand reputation management.
  • The application of data science in optimizing logistics and supply chain operations.
  • Developing deep learning techniques for image recognition in medical diagnostics.
  • The role of big data in analyzing climate change impacts on agricultural productivity.
  • Investigating the use of data analytics in optimizing energy consumption in smart buildings.
  • The application of machine learning in detecting plagiarism in academic works.
  • Analyzing social media data for trends in political opinion and electoral predictions.
  • The role of big data in enhancing sports performance analytics.
  • Developing data-driven strategies for effective water resource management.
  • The use of big data in improving customer experience in the banking sector.
  • Investigating the application of data science in fraud detection in insurance claims.
  • The role of predictive analytics in financial market risk assessment.
  • Developing AI models for early detection of network vulnerabilities.

Data Science Research Ideas (Continued)

  • The application of big data in public transportation systems for route optimization.
  • Investigating the impact of big data analytics on e-commerce recommendation systems.
  • The use of data mining techniques in understanding consumer preferences in the entertainment industry.
  • Developing predictive models for real estate pricing and market trends.
  • The role of big data in tracking and managing environmental pollution.
  • Investigating the use of data analytics in improving airline operational efficiency.
  • The application of machine learning in optimizing pharmaceutical drug discovery.
  • Analyzing online customer reviews to inform product development in the tech industry.
  • The role of data science in crime prediction and prevention strategies.
  • Developing models for analyzing financial time series data for investment strategies.
  • The use of big data in assessing the impact of educational policies on student performance.
  • Investigating the effectiveness of data visualization techniques in business reporting.
  • The application of data analytics in human resource management and talent acquisition.
  • Developing algorithms for anomaly detection in network traffic data.
  • The role of machine learning in enhancing personalized online learning experiences.
  • Investigating the use of big data in urban planning and smart city development.
  • The application of predictive analytics in weather forecasting and disaster management.
  • Analyzing consumer data to drive innovations in the automotive industry.
  • The role of data science in optimizing content delivery networks for streaming services.
  • Developing machine learning models for automated text classification in legal documents.
  • The use of big data in tracking global supply chain disruptions.
  • Investigating the application of data analytics in personalized nutrition and fitness.
  • The role of big data in enhancing the accuracy of geological surveying for natural resource exploration.
  • Developing predictive models for customer churn in the telecommunications industry.
  • The application of data science in optimizing advertisement placement and reach.

Recent Data Science-Related Studies

While the ideas we’ve presented above are a decent starting point for finding a research topic, they are fairly generic and non-specific. So, it helps to look at actual studies in the data science and analytics space to see how this all comes together in practice.

Below, we’ve included a selection of recent studies to help refine your thinking. These are actual studies, so they can provide some useful insight as to what a research topic looks like in practice.

  • Data Science in Healthcare: COVID-19 and Beyond (Hulsen, 2022)
  • Auto-ML Web-application for Automated Machine Learning Algorithm Training and evaluation (Mukherjee & Rao, 2022)
  • Survey on Statistics and ML in Data Science and Effect in Businesses (Reddy et al., 2022)
  • Visualization in Data Science VDS @ KDD 2022 (Plant et al., 2022)
  • An Essay on How Data Science Can Strengthen Business (Santos, 2023)
  • A Deep study of Data science related problems, application and machine learning algorithms utilized in Data science (Ranjani et al., 2022)
  • You Teach WHAT in Your Data Science Course?!? (Posner & Kerby-Helm, 2022)
  • Statistical Analysis for the Traffic Police Activity: Nashville, Tennessee, USA (Tufail & Gul, 2022)
  • Data Management and Visual Information Processing in Financial Organization using Machine Learning (Balamurugan et al., 2022)
  • A Proposal of an Interactive Web Application Tool QuickViz: To Automate Exploratory Data Analysis (Pitroda, 2022)
  • Applications of Data Science in Respective Engineering Domains (Rasool & Chaudhary, 2022)
  • Jupyter Notebooks for Introducing Data Science to Novice Users (Fruchart et al., 2022)
  • Towards a Systematic Review of Data Science Programs: Themes, Courses, and Ethics (Nellore & Zimmer, 2022)
  • Application of data science and bioinformatics in healthcare technologies (Veeranki & Varshney, 2022)
  • TAPS Responsibility Matrix: A tool for responsible data science by design (Urovi et al., 2023)
  • Data Detectives: A Data Science Program for Middle Grade Learners (Thompson & Irgens, 2022)
  • MACHINE LEARNING FOR NON-MAJORS: A WHITE BOX APPROACH (Mike & Hazzan, 2022)
  • COMPONENTS OF DATA SCIENCE AND ITS APPLICATIONS (Paul et al., 2022)
  • Analysis on the Application of Data Science in Business Analytics (Wang, 2022)

As you can see, these research topics are a lot more focused than the generic topic ideas we presented earlier. So, to develop a high-quality research topic, you’ll need to narrow your focus to a particular context with specific variables of interest. In the video below, we explore some other important things you’ll need to consider when crafting your research topic.

Get 1-On-1 Help

If you’re still unsure about how to find a quality research topic, check out our Research Topic Kickstarter service, which is the perfect starting point for developing a unique, well-justified research topic.

Thesis Helpers

Find the best tips and advice to improve your writing. Or, have a top expert write your paper.

Top 100 Big Data Research Topics For Students

Selecting the right big data research topics is the first and most important step in the process of writing academic papers or essays. Big data is becoming a popular phenomenon among scholars and practitioners. The multidisciplinary background of big data research encompasses a wide spectrum that covers scientific publications in different study areas.

Nevertheless, some students have difficulties choosing big data topics for their computer science thesis or research paper. That’s because finding information to write about some topics is not easy. To solve this problem, we list the top 100 topics in data science that learners can choose from.

Trendy Big Data Research Topics

Students who want to focus on emerging issues when writing academic papers and essays should choose trendy data science topics. Big data covers the initiatives and technologies that tackle data too massive and diverse for traditional skills, technologies, and infrastructure to address efficiently. Here are some of the latest data topics to consider when writing a research paper or essay.

  • Tools and software for processing big data
  • Privacy and security issues that face big data
  • Scalable architectures for processing massively parallel data
  • Analyzing large scale data for social networks
  • Scalable big data storage systems
  • Platforms for big data computing
  • Big data analytics and adoption
  • How to analyze big data
  • How to effectively manage big data
  • Parallel big data programming and processing techniques
  • Semantics in big data
  • Visualization of big data
  • Business intelligence and big data analytics
  • Map-reduce architecture and Hadoop programming
  • Methods for machine learning in big data
  • Big data analytics and privacy preservation
  • How to process stream data in big data
  • Uncertainty in big data management
  • Anomaly detection in large scale data systems
  • Analytics for big data in the Smart Healthcare systems
  • The importance of big data technologies for modern businesses

These are great data research topics that learners at different study levels should consider when asked to write academic papers or essays. However, extensive research is required to come up with great write-ups on these topics.

Data Mining Research Topics for Students

Data mining refers to the extraction of useful information from raw data. It’s a technique that companies apply to accomplish tasks like predictive analysis, association rule generation, and clustering. Data mining topics can explain this technique or address issues that are associated with it. Here are some of the best data mining project topics that learners can consider.

  • Big data mining techniques and tools
  • Model-based clustering of texts
  • Describe the concept of data spectroscopic clustering
  • Parallel spectral clustering within a distributed system
  • Describe asymmetrical spectral clustering
  • What is information-based clustering?
  • Self-tuning spectral clustering
  • Symmetrical spectral clustering
  • Discuss the K-Means algorithms in data clustering
  • Discuss the package of MATLAB spectral clustering
  • Discuss the K-Means clustering from an online spherical perspective
  • Discuss the hierarchical clustering application
  • Explain the importance of probabilistic classification in data mining
  • How can the effectiveness of nonlinear and linear regression analysis be improved?
  • Explain the Association Rule Learning regarding data mining
  • Explain the performance of dependency modeling
  • Discuss the performance of representative-based clustering
  • Explain the need for density-based clustering
  • Discuss the importance of subject-based data mining when it comes to reducing terrorism
  • How can data mining be used to analyze transaction data in a supermarket?

Most data mining current research topics focus on finding or establishing patterns. Students can even find some of the best data mining case study topics in this category. Nevertheless, every idea requires detailed and extensive research to come up with facts that make a great paper or essay.
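
The association rule and supermarket transaction topics above can be made concrete with a small example. The sketch below assumes each transaction is a set of item names and computes two standard measures for a rule such as "bread implies milk": support (how often the items appear together) and confidence (how often the consequent appears when the antecedent does). The function names and the sample baskets are hypothetical and chosen only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= set(basket))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by
    the support of the antecedent alone."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical supermarket baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))       # share of baskets with both items
print(confidence(baskets, {"bread"}, {"milk"}))  # share of bread baskets that also have milk
```

Algorithms such as Apriori build on these two measures by searching for all itemsets whose support exceeds a chosen threshold.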

Big Data Analysis Topics

The modern IT industry depends on data analytics as its lifeline. Big data analysis covers the techniques and technologies that are used to analyze vast data volumes. The industry is using data analytics as a strategy for gaining insights into system performance and customer behavior. Here are some of the best data analytics research topics that students can consider when writing academic papers.

  • Internet of Things
  • Describe the importance of augmented reality
  • How important is artificial intelligence?
  • Explain the graph analytics process
  • What is agile data science?
  • Why is machine intelligence important for modern businesses?
  • What is hyper-personalization?
  • Explain the behavioral analytics process
  • What is the experience economy?
  • Discuss journey sciences
  • Discuss knowledge validation and extraction
  • What is semantic data management?
  • Explain the deep learning process
  • Explain software engineering for big data science
  • What is structured machine learning?
  • Explain semantic question answering
  • What is distributed semantic analytics?
  • Why is domain knowledge important in data analysis?
  • Why is data exploration important in data analysis?
  • Who uses big data analytics?

Writing about data analytics topics requires background knowledge of the issues being discussed. That’s because the analysis entails harnessing data and extracting its value.

Data Management Project Topics

This category has some of the best data science research topics. The enormous amount of data that modern organizations have to deal with every day is not easy to handle. As such, its effective management is required to ensure its effective use. Here are some of the best topics that students can write about in this aspect.

  • Describe some of the most innovative big data management concepts
  • Data catalogs: Describe approaches, their implementation, and their adoption
  • How to manage platforms for enterprise analytics
  • Discuss the impact of data quality on a business
  • Explain the best data management strategies for modern enterprises
  • New technologies and AI in data management
  • What is data retention and why is it important?
  • Describe the basics of data management
  • Explain the application of data management basics
  • Data publishing and access by modern companies
  • Explain the process of analyzing and managing data for reproducible research
  • Explain how to work with images during research
  • How can an organization ensure secure and confidential handling and management of data?
  • How to promote research and scientific outreach through data management
  • How to source and manage external data
  • How to ensure effective data protection through proper management
  • Data catalog reference model and market study
  • What is data valuation and why does it matter in data management?
  • How can machine learning improve the data quality?
  • How can a company implement data governance?

This category also has some of the best big data seminar topics. That’s because some of the ideas featured in this section are about issues that affect almost every organization.

Recent Data Security Topics for Research

Big data that comes from different computers and devices requires security. That’s because such data is vulnerable to different cyber threats. Some of the best research topics in this category include the following.

  • How changing data from Terabytes to Petabytes affects its security
  • What are the major vulnerabilities for big data?
  • Why big data owners should update security measures regularly
  • How can poor data security lead to loss of important information?
  • Describe security technologies that can be used to protect big data
  • Explain how Hadoop integrates with modern security tools
  • Which are the best encryption tools for protecting transit data?
  • Explain how data encryption tools work
  • What is token-based authentication?
  • Explain how intrusion prevention and detection systems work
  • What are the most effective physical systems for securing data?
  • Which is the best intrusion detection system?
  • Describe the most suitable key management system when it comes to processing massive data
  • Which tool or algorithm can be used for data owner and user’s authentication?
  • Explain how you can determine the amount of secure data
  • How to identify a legit data user
  • How to prevent illegitimate data access
  • How to implement attribute-based or role-based access control
  • Explain the importance of centralized key management
  • Why is user-access control important?
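For the role-based access control topic above, a minimal sketch might look like the following. The users, roles, and permissions are hypothetical, and a production system would load them from a policy store rather than hard-coding them.

```python
# Hypothetical roles and permissions, invented for illustration only.
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "engineer": {"read", "write"},
    "admin": {"read", "write", "delete"},
}

# Hypothetical assignment of users to roles.
USER_ROLES = {
    "alice": {"analyst"},
    "bob": {"engineer", "admin"},
}

def is_allowed(user, action):
    """Return True if any role held by the user grants the action."""
    roles = USER_ROLES.get(user, set())
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)

print(is_allowed("alice", "read"))
print(is_allowed("alice", "write"))
```

Unknown users get no roles and are therefore denied by default, which is usually the safer design choice.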

Any topic in this category can be used to write a brilliant paper or essay that will earn the learner the top grade. However, time and effort are required to work on these ideas.

Whether students opt to write about data visualization topics or data structure research topics, the most important thing is to choose ideas they like and find interesting. What’s more, learners should pick topics they can find adequate information for online. That way, they will find the research and writing process enjoyable. They can also buy dissertations or any other academic papers that will impress educators to award them the top grades.


37 Research Topics In Data Science To Stay On Top Of

Stewart Kaplan

  • February 22, 2024

As a data scientist, staying on top of the latest research in your field is essential.

The data science landscape changes rapidly, and new techniques and tools are constantly being developed.

To keep up with the competition, you need to be aware of the latest trends and topics in data science research.

In this article, we will provide an overview of 37 hot research topics in data science.

We will discuss each topic in detail, including its significance and potential applications.

These topics could be an idea for a thesis or simply topics you can research independently.

Stay tuned – this is one blog post you don’t want to miss!

37 Research Topics in Data Science

1.) Predictive Modeling

Predictive modeling is a significant portion of data science and a topic you must be aware of.

Simply put, it is the process of using historical data to build models that can predict future outcomes.

Predictive modeling has many applications, from marketing and sales to financial forecasting and risk management.

As businesses increasingly rely on data to make decisions, predictive modeling is becoming more and more important.

While it can be complex, predictive modeling is a powerful tool that gives businesses a competitive advantage.
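To make the idea of using historical data to predict future outcomes concrete, here is a minimal sketch that fits a straight line by ordinary least squares. The monthly sales figures are invented, and real predictive models are usually far richer than a single trend line.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: sales grow by exactly 2 units per month from a base of 10.
months = [1, 2, 3, 4, 5]
sales = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept
print(slope, intercept, forecast_month_6)
```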


2.) Big Data Analytics

These days, it seems like everyone is talking about big data.

And with good reason – organizations of all sizes are sitting on mountains of data, and they’re increasingly turning to data scientists to help them make sense of it all.

But what exactly is big data? And what does it mean for data science?

Simply put, big data is a term used to describe datasets that are too large and complex for traditional data processing techniques.

Big data typically refers to datasets of a few terabytes or more.

But a raw size threshold isn’t the only defining characteristic: big data is commonly described by three Vs, namely Volume (the amount of information), Velocity (the speed at which data is generated), and Variety (the different types of data).

Given the enormity of big data, it’s not surprising that organizations are struggling to make sense of it all.

That’s where data science comes in.

Data scientists use various methods to wrangle big data, including distributed computing and other decentralized technologies.

With the help of data science, organizations are beginning to unlock the hidden value in their big data.

By harnessing the power of big data analytics, they can improve their decision-making, better understand their customers, and develop new products and services.

3.) Auto Machine Learning

Auto machine learning (AutoML) is a research topic in data science concerned with developing algorithms that can learn from data with minimal human intervention.

This area of research is vital because it spares data scientists from hand-writing modeling code for every new dataset.

This allows us to focus on other tasks, such as model selection and validation.

Auto machine learning algorithms can learn from data in a hands-off way for the data scientist – while still providing incredible insights.

This makes them a valuable tool for data scientists who either don’t have the skills to do their own analysis or are struggling with a particular problem.


4.) Text Mining

Text mining is a research topic in data science that deals with text data extraction.

This area of research is important because it allows us to get as much information as possible from the vast amount of text data available today.

Text mining techniques can extract information from text data, such as keywords, sentiments, and relationships.

This information can be used for various purposes, such as model building and predictive analytics.
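As a small taste of keyword extraction, the sketch below counts the most frequent non-trivial words in a piece of text. The stop-word list and the sample review are assumptions made for the example; real text mining pipelines use much larger stop-word lists and smarter tokenization.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list (an assumption for this example).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "it", "was", "but"}

def top_keywords(text, k=3):
    """Return the k most frequent words that are not stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    return counts.most_common(k)

review = "The battery is great. The screen is great, but the battery charger was slow."
print(top_keywords(review, 2))
```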

5.) Natural Language Processing

Natural language processing is a data science research topic that analyzes human language data.

This area of research is important because it allows us to understand and make sense of the vast amount of text data available today.

Natural language processing techniques can build predictive and interactive models from any language data.

Natural Language processing is pretty broad, and recent advances like GPT-3 have pushed this topic to the forefront.


6.) Recommender Systems

Recommender systems are an exciting topic in data science because they allow us to make better products, services, and content recommendations.

Businesses can better understand their customers and their needs by using recommender systems.

This, in turn, allows them to develop better products and services that meet the needs of their customers.

Recommender systems are also used to recommend content to users.

This can be done on an individual level or at a group level.

Think about Netflix, for example, always knowing what you want to watch!

Recommender systems are a valuable tool for businesses and users alike.
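One classic way to recommend content is user-based collaborative filtering: find users with similar tastes and suggest what they liked. The sketch below is a toy version with invented users and ratings; it is not how any particular streaming service works.

```python
from math import sqrt

# Hypothetical users and their 1-5 star ratings.
ratings = {
    "ana": {"Alien": 5, "Heat": 4, "Up": 1},
    "ben": {"Alien": 5, "Heat": 5, "Jaws": 4},
    "cy": {"Up": 5, "Coco": 4, "Alien": 1},
}

def cosine(u, v):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[title] * v[title] for title in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user, all_ratings):
    """Rank titles the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, their_ratings in all_ratings.items():
        if other == user:
            continue
        similarity = cosine(all_ratings[user], their_ratings)
        for title, rating in their_ratings.items():
            if title not in all_ratings[user]:
                scores[title] = scores.get(title, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("ana", ratings))
```

Here "ana" rates films much like "ben" does, so the title only "ben" has rated ranks first.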

7.) Deep Learning

Deep learning is a research topic in data science that deals with artificial neural networks.

These networks are composed of multiple layers, and each layer is formed from various nodes.

Deep learning networks can learn patterns directly from data, in a way loosely inspired by how humans learn, across many different data distributions.

This makes them a valuable tool for data scientists looking to build models that can learn from data independently.

Deep learning networks have become very popular in recent years because of their ability to achieve state-of-the-art results on various tasks.

There seems to be a new SOTA deep learning algorithm research paper on https://arxiv.org/ every single day!
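To picture the layers and nodes mentioned above, here is a minimal sketch of a forward pass through a two-layer network in plain Python. The weights are hand-picked rather than trained, so this shows only the structure, not the learning.

```python
import math

def dense(inputs, weights, biases):
    """One layer: each node outputs a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(node_weights, inputs)) + bias
        for node_weights, bias in zip(weights, biases)
    ]

def relu(values):
    """Keep positive values and clamp negative ones to zero."""
    return [max(0.0, v) for v in values]

def sigmoid(value):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))

# Hand-picked (not trained) parameters: 2 inputs -> 2 hidden nodes -> 1 output.
HIDDEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [[1.0, 2.0]]
OUTPUT_BIASES = [-1.0]

def forward(x):
    """Run one input vector through the hidden layer and the output layer."""
    hidden = relu(dense(x, HIDDEN_WEIGHTS, HIDDEN_BIASES))
    return sigmoid(dense(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0])

print(forward([2.0, 1.0]))
```

Training would adjust the weights and biases automatically, typically with backpropagation, which is left out here.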


8.) Reinforcement Learning

Reinforcement learning is a research topic in data science that deals with algorithms that learn by trial and error from interactions with their environment, guided by rewards.

This area of research is essential because it allows us to develop algorithms that can learn non-greedy approaches to decision-making, allowing businesses and companies to optimize for long-term gains rather than short-term ones.
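The difference between greedy and non-greedy decisions can be sketched with value iteration on a toy problem. The states, actions, and rewards below are invented: cashing out pays a little now, while investing pays nothing now but more later.

```python
# TRANSITIONS[state][action] = (immediate_reward, next_state).
# A next_state of None means the episode ends. All values are hypothetical.
TRANSITIONS = {
    "start": {"cash_out": (1.0, None), "invest": (0.0, "grown")},
    "grown": {"harvest": (5.0, None)},
}

def action_value(reward, next_state, values, gamma):
    """Immediate reward plus the discounted value of the next state."""
    future = values[next_state] if next_state is not None else 0.0
    return reward + gamma * future

def value_iteration(transitions, gamma=0.9, sweeps=20):
    """Repeatedly back up the best action value for every state."""
    values = {state: 0.0 for state in transitions}
    for _ in range(sweeps):
        for state, actions in transitions.items():
            values[state] = max(
                action_value(reward, nxt, values, gamma)
                for reward, nxt in actions.values()
            )
    return values

def greedy_choice(state, transitions):
    """Pick the action with the largest immediate reward only."""
    actions = transitions[state]
    return max(actions, key=lambda a: actions[a][0])

def long_term_choice(state, transitions, values, gamma=0.9):
    """Pick the action with the largest discounted long-term value."""
    actions = transitions[state]
    return max(actions, key=lambda a: action_value(*actions[a], values, gamma))

values = value_iteration(TRANSITIONS)
print(greedy_choice("start", TRANSITIONS))
print(long_term_choice("start", TRANSITIONS, values))
```

Value iteration assumes the rewards and transitions are known in advance; full reinforcement learning has to estimate them from experience.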

9.) Data Visualization

Data visualization is an excellent research topic in data science because it allows us to see our data in a way that is easy to understand.

Data visualization techniques can be used to create charts, graphs, and other visual representations of data.

This allows us to see the patterns and trends hidden in our data.

Data visualization is also used to communicate results to others.

This allows us to share our findings with others in a way that is easy to understand.

There are many ways to contribute to and learn about data visualization.

Some ways include attending conferences, reading papers, and contributing to open-source projects.


10.) Predictive Maintenance

Predictive maintenance is a hot topic in data science because it allows us to prevent failures before they happen.

This is done using data analytics to predict when a failure will occur.

This allows us to take corrective action before the failure actually happens.

While this sounds simple, avoiding false positives while keeping recall is challenging and an area wide open for advancement.
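The trade-off between false positives and recall can be shown with a simple threshold alarm. The vibration readings and failure labels below are invented; the point is only that lowering the threshold catches more failures at the cost of more false alarms.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall from parallel lists of booleans."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical sensor readings and whether each machine later failed.
vibration = [0.2, 0.9, 0.4, 0.8, 0.3, 0.7, 0.1, 0.6]
failed = [False, True, False, True, False, False, False, True]

def evaluate(threshold):
    """Raise an alarm when vibration meets the threshold, then score the alarms."""
    alarms = [reading >= threshold for reading in vibration]
    return precision_recall(alarms, failed)

print("strict threshold:", evaluate(0.75))
print("loose threshold:", evaluate(0.5))
```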

11.) Financial Analysis

Financial analysis is an older topic that has been around for a while but is still a great field where contributions can be felt.

Current researchers are focused on analyzing macroeconomic data to make better financial decisions.

This is done by analyzing the data to identify trends and patterns.

Financial analysts can use this information to make informed decisions about where to invest their money.

Financial analysis is also used to predict future economic trends.

This allows businesses and individuals to prepare for potential financial hardships and enables companies to be cash-heavy during good economic conditions.

Overall, financial analysis is a valuable tool for anyone looking to make better financial decisions.


12.) Image Recognition

Image recognition is one of the hottest topics in data science because it allows us to identify objects in images.

This is done using artificial intelligence algorithms that can learn from data and understand what objects you’re looking for.

This allows us to build models that can accurately recognize objects in images and video.

This is a valuable tool for businesses and individuals who want to be able to identify objects in images.

Think about security, identification, routing, traffic, etc.

Image Recognition has gained a ton of momentum recently – for a good reason.

13.) Fraud Detection

Fraud detection is a great topic in data science because it allows us to identify fraudulent activity before it happens.

This is done by analyzing data to look for patterns and trends that may be associated with the fraud.

Once our machine learning model recognizes some of these patterns in real time, it can immediately flag the activity as potentially fraudulent.

This allows us to take corrective action before the fraud actually happens.

Fraud detection is a valuable tool for anyone who wants to protect themselves from potential fraudulent activity.
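As a much simpler stand-in for a trained machine learning model, the sketch below flags transactions that sit far from a customer's usual spending using a z-score. The amounts and the threshold of 3 are assumptions for illustration.

```python
from statistics import mean, pstdev

# Hypothetical history of one customer's normal transaction amounts.
history = [20, 25, 22, 19, 24, 21, 23]

baseline_mean = mean(history)
baseline_std = pstdev(history)

def z_score(amount):
    """How many standard deviations the amount is from the usual spend."""
    return (amount - baseline_mean) / baseline_std

def is_suspicious(amount, threshold=3.0):
    """Flag amounts that are unusually far from the customer's baseline."""
    return abs(z_score(amount)) > threshold

for amount in (26, 500):
    print(amount, is_suspicious(amount))
```

Real systems combine many signals (location, merchant, timing) rather than relying on amount alone.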


14.) Web Scraping

Web scraping is a controversial topic in data science because it allows us to collect data from the web, which is usually data you do not own.

This is done by extracting data from websites using scraping tools that are usually custom-programmed.

This allows us to collect data that would otherwise be inaccessible.

For obvious reasons, web scraping is a unique tool – giving you data your competitors would have no chance of getting.

I think there is an excellent opportunity to create new and innovative ways to make scraping accessible for everyone, not just those who understand Selenium and Beautiful Soup.
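For a feel of what a custom scraping tool does under the hood, here is a minimal sketch that pulls links out of an HTML string using only Python's standard library. The page content is invented and no network request is made; always check a site's terms before scraping it for real.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag encountered."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# A hypothetical page; a real scraper would download this text first.
page = (
    '<html><body><a href="/pricing">Pricing</a>'
    "<p>No link here</p>"
    '<a href="https://example.com/blog">Blog</a></body></html>'
)

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```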

15.) Social Media Analysis

Social media analysis is not new; many people have already created exciting and innovative algorithms to study this.

However, it is still a great data science research topic because it allows us to understand how people interact on social media.

This is done by analyzing data from social media platforms to look for insights, bots, and recent societal trends.

Once we understand these practices, we can use this information to improve our marketing efforts.

For example, if we know that a particular demographic prefers a specific type of content, we can create more content that appeals to them.

Social media analysis is also used to understand how people interact with brands on social media.

This allows businesses to understand better what their customers want and need.

Overall, social media analysis is valuable for anyone who wants to improve their marketing efforts or understand how customers interact with brands.


16.) GPU Computing

GPU computing is a fun new research topic in data science because it allows us to process data much faster than traditional CPUs.

Due to how GPUs are made, they’re incredibly proficient at intense matrix operations, outperforming traditional CPUs by very high margins.

While the computation is fast, the coding is still tricky.

There is an excellent research opportunity to bring these innovations to non-traditional modules, allowing data science to take advantage of GPU computing outside of deep learning.

17.) Quantum Computing

Quantum computing is a new research topic in data science and physics because it allows us to process data much faster than traditional computers.

It also opens the door to new types of data.

There are some problems that simply can’t be solved efficiently on a classical computer.

For example, if you wanted to understand how a single atom moved around, a classical computer couldn’t handle this problem.

You’ll need to utilize a quantum computer to handle quantum mechanics problems.

This may be the “hottest” research topic on the planet right now, with some of the top researchers in computer science and physics worldwide working on it.

You could be too.


18.) Genomics

Genomics may be the only research topic that can compete with quantum computing regarding the “number of top researchers working on it.”

Genomics is a fantastic intersection of data science because it allows us to understand how genes work.

This is done by sequencing the DNA of different organisms to look for insights into our own and other species.

Once we understand these patterns, we can use this information to improve our understanding of diseases and create new and innovative treatments for them.

Genomics is also used to study the evolution of different species.

Genomics is the future and a field begging for new and exciting research professionals to take it to the next step.

19.) Location-based services

Location-based services are an old and time-tested research topic in data science.

Since GPS and 4G cell phone reception became a thing, we’ve been trying to stay informed about how humans interact with their environment.

This is done by analyzing data from GPS tracking devices, cell phone towers, and Wi-Fi routers to look for insights into how humans interact.

Once we understand these practices, we can use this information to improve our geotargeting efforts, improve maps, find faster routes, and improve cohesion throughout a community.

Location-based services are used to understand the user, something every business could always use a little bit more of.

While a seemingly “stale” field, location-based services have seen a revival period with self-driving cars.
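A basic building block for working with GPS data is the distance between two coordinates. The sketch below uses the haversine formula, which assumes a spherical Earth with a mean radius of 6371 km, so it is an approximation rather than a survey-grade result.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # Mean Earth radius; assumes a perfect sphere.

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    delta_phi = radians(lat2 - lat1)
    delta_lambda = radians(lon2 - lon1)
    a = sin(delta_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(delta_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# A quarter of the way around the equator.
print(haversine_km(0.0, 0.0, 0.0, 90.0))
```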


20.) Smart City Applications

Smart city applications are all the rage in data science research right now.

By harnessing the power of data, cities can become more efficient and sustainable.

But what exactly are smart city applications?

In short, they are systems that use data to improve city infrastructure and services.

This can include anything from traffic management and energy use to waste management and public safety.

Data is collected from various sources, including sensors, cameras, and social media.

It is then analyzed to identify tendencies and habits.

This information can be used to make predictions about future needs and optimize city resources.

As more and more cities strive to become “smart,” the demand for data scientists with expertise in smart city applications is only growing.

21.) Internet Of Things (IoT)

The Internet of Things, or IoT, is an exciting new data science and sustainability research topic.

IoT is a network of physical objects embedded with sensors and connected to the internet.

These objects can include everything from alarm clocks to refrigerators; they’re all connected to the internet.

That means that they can share data with computers.

And that’s where data science comes in.

Data scientists are using IoT data to learn everything from how people use energy to how traffic flows through a city.

They’re also using IoT data to predict when an appliance will break down or when a road will be congested.

Really, the possibilities are endless.

With such a wide-open field, it’s easy to see why IoT is being researched by some of the top professionals in the world.


22.) Cybersecurity

Cybersecurity is a relatively new research topic in data science and in general, but it’s already garnering a lot of attention from businesses and organizations.

After all, with the increasing number of cyber attacks in recent years, it’s clear that we need to find better ways to protect our data.

While most of cybersecurity focuses on infrastructure, data scientists can leverage historical events to find potential exploits to protect their companies.

Sometimes, looking at a problem from a different angle helps, and that’s what data science brings to cybersecurity.

Also, data science can help to develop new security technologies and protocols.

As a result, cybersecurity is a crucial data science research area and one that will only become more important in the years to come.

23.) Blockchain

Blockchain is an incredible new research topic in data science for several reasons.

First, it is a distributed database technology that enables secure, transparent, and tamper-proof transactions.

Did someone say transmitting data?

This makes it an ideal platform for tracking data and transactions in various industries.

Second, blockchain is powered by cryptography, which not only makes it highly secure – but is a familiar foe for data scientists.

Finally, blockchain is still in its early stages of development, so there is much room for research and innovation.

As a result, blockchain is a great new research topic in data science that vows to revolutionize how we store, transmit and manage data.
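The tamper-evident part of a blockchain can be sketched as a chain of hashes, where each block stores the hash of the one before it. This toy version uses made-up records and leaves out everything that makes a real blockchain distributed, such as consensus and networking.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # Placeholder "previous hash" for the first block.

def block_hash(index, data, prev_hash):
    """SHA-256 over the block's contents, including the previous block's hash."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def build_chain(records):
    """Link records together so each block commits to the one before it."""
    chain = []
    prev = GENESIS_HASH
    for index, data in enumerate(records):
        current = block_hash(index, data, prev)
        chain.append({"index": index, "data": data, "prev": prev, "hash": current})
        prev = current
    return chain

def is_valid(chain):
    """Recompute every hash and check that the links between blocks are intact."""
    prev = GENESIS_HASH
    for block in chain:
        expected = block_hash(block["index"], block["data"], block["prev"])
        if block["prev"] != prev or block["hash"] != expected:
            return False
        prev = block["hash"]
    return True

chain = build_chain(["a pays b 5", "b pays c 2"])
print(is_valid(chain))
chain[0]["data"] = "a pays b 500"  # Tamper with an old record.
print(is_valid(chain))
```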


24.) Sustainability

Sustainability is a relatively new research topic in data science, but it is gaining traction quickly.

To keep up with this demand, The Wharton School of the University of Pennsylvania has started to offer an MBA in Sustainability.

This demand isn’t shocking, and some of the reasons include the following:

  • Sustainability is an important issue that is relevant to everyone.
  • Datasets on sustainability are constantly growing and changing, making it an exciting challenge for data scientists.
  • There hasn’t been a “set way” to approach sustainability from a data perspective, making it an excellent opportunity for interdisciplinary research.

As data science grows, sustainability will likely become an increasingly important research topic.

25.) Educational Data

Education has always been a great topic for research, and with the advent of big data, educational data has become an even richer source of information.

By studying educational data, researchers can gain insights into how students learn, what motivates them, and what barriers these students may face.

Besides, data science can be used to develop educational interventions tailored to individual students’ needs.

Imagine being the researcher that helps that high schooler pass mathematics; what an incredible feeling.

With the increasing availability of educational data, data science has enormous potential to improve the quality of education.


26.) Politics

As data science continues to evolve, so does the scope of its applications.

Originally used primarily for business intelligence and marketing, data science is now applied to various fields, including politics.

By analyzing large data sets, political scientists (data scientists with a cooler name) can gain valuable insights into voting patterns, campaign strategies, and more.

Further, data science can be used to forecast election results and understand the effects of political events on public opinion.

With the wealth of data available, there is no shortage of research opportunities in this field.

As data science evolves, so does our understanding of politics and its role in our world.

27.) Cloud Technologies

Cloud technologies are a great research topic.

They allow for the outsourcing and sharing of computer resources and applications all over the internet.

This lets organizations save money on hardware and maintenance costs while providing employees access to the latest and greatest software and applications.

I believe there is an argument that AWS could be the greatest and most technologically advanced business ever built (Yes, I know it’s only part of the company).

Besides, cloud technologies can help improve team members’ collaboration by allowing them to share files and work on projects together in real-time.

As more businesses adopt cloud technologies, data scientists must stay up-to-date on the latest trends in this area.

By researching cloud technologies, data scientists can help organizations to make the most of this new and exciting technology.


28.) Robotics

Robotics has recently become a household name, and it’s for a good reason.

First, robotics deals with controlling and planning physical systems, an inherently complex problem.

Second, robotics requires various sensors and actuators to interact with the world, making it an ideal application for machine learning techniques.

Finally, robotics is an interdisciplinary field that draws on various disciplines, such as computer science, mechanical engineering, and electrical engineering.

As a result, robotics is a rich source of research problems for data scientists.

29.) HealthCare

Healthcare is an industry that is ripe for data-driven innovation.

Hospitals, clinics, and health insurance companies generate a tremendous amount of data daily.

This data can be used to improve the quality of care and outcomes for patients.

This is perfect timing, as the healthcare industry is undergoing a significant shift towards value-based care, which means there is a greater need than ever for data-driven decision-making.

As a result, healthcare is an exciting new research topic for data scientists.

There are many different ways in which data can be used to improve healthcare, and there is a ton of room for newcomers to make discoveries.


30.) Remote Work

There’s no doubt that remote work is on the rise.

In today’s global economy, more and more businesses are allowing their employees to work from home or anywhere else they can get a stable internet connection.

But what does this mean for data science? Well, for one thing, it opens up a whole new field of research.

For example, how does remote work impact employee productivity?

What are the best ways to manage and collaborate on data science projects when team members are spread across the globe?

And what are the cybersecurity risks associated with working remotely?

These are just a few of the questions that data scientists will be able to answer with further research.

So if you’re looking for a new topic to sink your teeth into, remote work in data science is a great option.

31.) Data-Driven Journalism

Data-driven journalism is an exciting new field of research that combines the best of both worlds: the rigor of data science with the creativity of journalism.

By applying data analytics to large datasets, journalists can uncover stories that would otherwise be hidden.

And telling these stories compellingly can help people better understand the world around them.

Data-driven journalism is still in its infancy, but it has already had a major impact on how news is reported.

In the future, it will only become more important as journalists become increasingly fluent in data.

It is an exciting new topic and research field for data scientists to explore.


32.) Data Engineering

Data engineering is a staple in data science, focusing on efficiently managing data.

Data engineers are responsible for developing and maintaining the systems that collect, process, and store data.

In recent years, there has been an increasing demand for data engineers as the volume of data generated by businesses and organizations has grown exponentially.

Data engineers must be able to design and implement efficient data-processing pipelines and have the skills to optimize and troubleshoot existing systems.

If you are looking for a challenging research topic that could have an immediate worldwide impact, then improving or innovating a new approach in data engineering would be a good start.

33.) Data Curation

Data curation has been a hot topic in the data science community for some time now.

Curating data involves organizing, managing, and preserving data so researchers can use it.

Data curation can help to ensure that data is accurate, reliable, and accessible.

It can also help to prevent research duplication and to facilitate the sharing of data between researchers.

Data curation is a vital part of data science. In recent years, there has been an increasing focus on data curation, as it has become clear that it is essential for ensuring data quality.

As a result, data curation is now a major research topic in data science.

There are numerous books and articles on the subject, and many universities offer courses on data curation.

Data curation is an integral part of data science and will only become more important in the future.


34.) Meta-Learning

Meta-learning is gaining a ton of steam in data science. It’s learning how to learn.

So, if you can learn how to learn, you can learn anything much faster.

Meta-learning is mainly used in deep learning, as applications outside of this are generally pretty hard.

In deep learning, many parameters need to be tuned for a good model, and there’s usually a lot of data.

You can save time and effort if you can automatically and quickly do this tuning.

In machine learning, meta-learning can improve models’ performance by sharing knowledge between different models.

For example, if you have a bunch of different models that all solve the same problem, then you can use meta-learning to share the knowledge between them to improve the group’s overall performance.

I don’t know how anyone looking for a research topic could stay away from this field; it’s what the Terminator warned us about!

35.) Data Warehousing

A data warehouse is a system used for data analysis and reporting.

It is a central data repository created by combining data from multiple sources.

Data warehouses are often used to store historical data, such as sales data, financial data, and customer data.

This type of data can be used to create reports and perform statistical analysis.

Data warehouses also store data that the organization is not currently using.

This type of data can be used for future research projects.

Data warehousing is an incredible research topic in data science because it offers a variety of benefits.

Data warehouses help organizations to save time and money by reducing the need for manual data entry.

They also help to improve the accuracy of reports and provide a complete picture of the organization’s performance.

Data warehousing feels like one of the weakest parts of the Data Science Technology Stack; if you want a research topic that could have a monumental impact – data warehousing is an excellent place to look.
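The core idea of combining data from multiple sources into one place for reporting can be sketched with an in-memory SQLite database. The table layout and sales figures are invented, and a real warehouse would run on a dedicated platform with far more data.

```python
import sqlite3

# An in-memory database stands in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, source TEXT, amount REAL)")

# Two hypothetical source systems feeding the same central table.
online_orders = [("north", 120.0), ("south", 80.0)]
store_orders = [("north", 200.0), ("south", 50.0), ("south", 70.0)]

conn.executemany("INSERT INTO sales VALUES (?, 'online', ?)", online_orders)
conn.executemany("INSERT INTO sales VALUES (?, 'store', ?)", store_orders)

# One report across both sources: total sales per region.
report = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(report)
```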


36.) Business Intelligence

Business intelligence aims to collect, process, and analyze data to help businesses make better decisions.

Business intelligence can improve marketing, sales, customer service, and operations.

It can also be used to identify new business opportunities and track competition.

BI is another tool in your company’s toolbox to help you keep dominating your market.

Data science is the perfect tool for business intelligence because it combines statistics, computer science, and machine learning.

Data scientists can use business intelligence to answer questions like, “What are our customers buying?” or “What are our competitors doing?” or “How can we increase sales?”

Business intelligence is a great way to improve your business’s bottom line and an excellent opportunity to dive deep into a well-respected research topic.

37.) Crowdsourcing

One of the newest areas of research in data science is crowdsourcing.

Crowdsourcing is a process of sourcing tasks or projects to a large group of people, typically via the internet.

This can be done for various purposes, such as gathering data, developing new algorithms, or even just for fun (think: online quizzes and surveys).

But what makes crowdsourcing so powerful is that it allows businesses and organizations to tap into a vast pool of talent and resources they wouldn’t otherwise have access to.

And with the rise of social media, it’s easier than ever to connect with potential crowdsource workers worldwide.

Imagine if you could affect that by finding innovative ways to improve how people work together.

That would have a huge effect.


Final Thoughts, Are These Research Topics In Data Science For You?

Thirty-seven different research topics in data science are a lot to take in, but we hope you found a research topic that interests you.

If not, don’t worry – there are plenty of other great topics to explore.

The important thing is to get started with your research and find ways to apply what you learn to real-world problems.

We wish you the best of luck as you begin your data science journey!


Stewart Kaplan


List of Best Research and Thesis Topic Ideas for Data Science in 2022

In an era driven by digital and technological transformation, businesses actively seek skilled data science professionals capable of leveraging data insights to enhance business productivity and achieve organizational objectives. In keeping with the increasing demand for data science professionals, universities offer various data science and big data courses to prepare students for the tech industry. Research projects are a crucial part of these programs, and a well-executed data science project can make your CV appear more robust and compelling. A broad range of data science topics offers exciting possibilities for research, but choosing data science research topics can be a real challenge for students. After all, a good research project relies first and foremost on data analytics research topics that draw upon both mono-disciplinary and multi-disciplinary research to explore possibilities for real-world applications.

As one of the top master’s and PhD online dissertation writing services, we are geared to assist students through the entire research process, from initial conception to final execution, to ensure that you have a truly fulfilling and enriching research experience. These resources are also helpful for students who are taking online classes.

By taking advantage of our best digital marketing research topics in data science you can be assured of producing an innovative research project that will impress your research professors and make a huge difference in attracting the right employers.


Data science thesis topics

We have compiled a list of data science research topics for students that can be used in data science projects in 2022. Our team of professional data experts has brought together master’s or MBA thesis topics in data science that cater to the core areas driving the field of data science and big data, that will relieve your research anxieties, and that provide a solid grounding for an interesting research project. The article features data science thesis ideas that cover a broad research agenda for the future of data science. These ideas have been drawn from the 8 V’s of big data, namely Volume, Value, Veracity, Visualization, Variety, Velocity, Viscosity, and Virality, which provide interesting and challenging research areas for prospective researchers in their master’s or PhD theses. Overall, the general big data research topics can be divided into distinct categories to facilitate the research topic selection process.

  • Security and privacy issues
  • Cloud Computing Platforms for Big Data Adoption and Analytics
  • Real-time data analytics for processing of image, video and text
  • Modeling uncertainty


DATA SCIENCE PHD RESEARCH TOPICS

The article will also guide students engaged in doctoral research by introducing them to an outstanding list of data science thesis topics that can lead to major real-time applications of big data analytics in your research projects.

  • Intelligent traffic control: gathering and monitoring traffic information using CCTV images.
  • Asymmetric protected storage methodology over multi-cloud service providers in Big data.
  • Leveraging disseminated data over big data analytics environment.
  • Internet of Things.
  • Large-scale data system and anomaly detection.

What makes us a unique research service for your research needs?

We offer all-round research services with a distinguished track record of helping students secure their desired grades in research projects in big data analytics, paving the way for a promising career ahead. These are the features that set us apart in the market for research services:

  • Plagiarism-free: We strictly adhere to a non-plagiarism policy in all our research work to provide you with well-written, original content with a low similarity index, to maximize the chances of acceptance of your research submissions.
  • Publication: We don’t just suggest PhD data science research topics; our PhD consultancy services take your research to the next level by ensuring its publication in well-reputed journals. A PhD thesis is indispensable for a PhD degree, and with our PhD thesis services, which tackle all aspects of research writing and cater to the essential requirements of journals, we will bring you closer to your dream of earning a PhD in the field of data analytics.
  • Research ethics: Solid research ethics lie at the core of our services, and we actively seek to protect the privacy and confidentiality of the technical and personal information of our valued customers.
  • Research experience: We take pride in our world-class team of computing industry professionals equipped with the expertise and experience to assist in choosing data science research topics and in the subsequent phases of research, including finding solutions, code development and final manuscript writing.
  • Business ethics: We are driven by a business philosophy that’s wholly committed to achieving total customer satisfaction by providing constant online and offline support and timely submissions so that you can keep track of the progress of your research.

Now, we’ll proceed to cover specific research problems encompassing both data analytics research topics and big data thesis topics that have applications across multiple domains.


Multi-modal Transfer Learning for Cross-Modal Information Retrieval

Aim and objectives.

The research aims to examine the use of the cross-modal retrieval (CMR) approach to provide a flexible retrieval experience by combining data across different modalities, given the abundance of multimedia data.

  • Develop methods to enable learning across different modalities in shared cross-modal spaces comprising texts and images, and consider the limitations of existing cross-modal retrieval algorithms.
  • Investigate the presence and effects of bias in cross-modal transfer learning and suggest strategies for bias detection and mitigation.
  • Develop a tool with query expansion and relevance feedback capabilities to facilitate search and retrieval of multi-modal data.
  • Investigate the methods of multi modal learning and elaborate on the importance of multi-modal deep learning to provide a comprehensive learning experience.

The Role of Machine Learning in Facilitating the Implementation of Scientific Computing and Software Engineering

  • Evaluate how machine learning leads to improvements in computational tools and thus aids in the implementation of scientific computing.
  • Evaluate the effectiveness of machine learning in solving complex problems and improving the efficiency of scientific computing and software engineering processes.
  • Assess the potential benefits and challenges of using machine learning in these fields, including factors such as cost, accuracy, and scalability.
  • Examine the ethical and social implications of using machine learning in scientific computing and software engineering, such as issues related to bias, transparency, and accountability.

Trustworthy AI

The research aims to explore the crucial role of data science in advancing scientific goals and solving problems, as well as the implications of using AI systems, especially with respect to ethical concerns.

  • Investigate the value of digital infrastructures available through open data in aiding the sharing and interlinking of data for enhanced global collaborative research efforts.
  • Provide explanations of the outcomes of a machine learning model that allow a meaningful interpretation and build trust among users in the reliability and authenticity of data.
  • Investigate how formal models can be used to verify and establish the efficacy of the results derived from probabilistic models.
  • Review the concept of Trustworthy computing as a relevant framework for addressing the ethical concerns associated with AI systems.

The Implementation of Data Science and its impact on the management environment and sustainability

The aim of the research is to demonstrate how data science and analytics can be leveraged in achieving sustainable development.

  • To examine the implementation of data science using data-driven decision-making tools
  • To evaluate the impact of modern information technology on management environment and sustainability.
  • To examine the use of data science in achieving more effective and efficient environment management.
  • To explore how data science and analytics can be used to achieve sustainability goals across the three dimensions of sustainability: economic, social and environmental.

Big data analytics in healthcare systems

The aim of the research is to examine the application of big data analytics in creating smart healthcare systems and how it can lead to more efficient, accessible and cost-effective health care.

  • Identify the potential areas or opportunities for big data to transform the healthcare system, such as diagnosis, treatment planning, or drug development.
  • Assess the potential benefits and challenges of using AI and deep learning in healthcare, including factors such as cost, efficiency, and accessibility.
  • Evaluate the effectiveness of AI and deep learning in improving patient outcomes, such as reducing morbidity and mortality rates, improving the accuracy and speed of diagnoses, or reducing medical errors.
  • Examine the ethical and social implications of using AI and deep learning in healthcare, such as issues related to bias, privacy, and autonomy.

Large-Scale Data-Driven Financial Risk Assessment

The research aims to explore the possibility offered by big data in a consistent and real time assessment of financial risks.

  • Investigate how the use of big data can help to identify and forecast risks that can harm a business.
  • Categorise the types of financial risks faced by companies.
  • Describe the importance of financial risk management for companies in business terms.
  • Train a machine learning model to classify transactions as fraudulent or genuine.
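The last objective above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the transactions are synthetic, the two features (a scaled amount and a night-time flag) and all function names are hypothetical, and logistic regression trained by stochastic gradient descent stands in for whatever model a real study would choose.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_transactions(n, seed=0):
    """Generate synthetic (features, label) pairs; label 1 means fraudulent.

    The features and their distributions are invented for illustration only.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        fraud = 1 if rng.random() < 0.2 else 0
        amount = rng.gauss(3.0 if fraud else 1.0, 0.5)  # scaled amount
        night = 1.0 if rng.random() < (0.8 if fraud else 0.2) else 0.0
        rows.append(([amount, night], fraud))
    return rows


def train(rows, epochs=100, lr=0.1):
    """Fit logistic regression weights with per-sample gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in rows:
            z = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(z) - label  # gradient of the log loss w.r.t. z
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def predict(model, features):
    """Return 1 (fraudulent) if the predicted probability is at least 0.5."""
    weights, bias = model
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for features, label in rows if predict(model, features) == label)
    return hits / len(rows)


if __name__ == "__main__":
    model = train(make_transactions(400, seed=1))
    print("held-out accuracy:", accuracy(model, make_transactions(200, seed=2)))
```

On real transaction data, fraud is usually far rarer than in this toy set, so precision and recall on the fraudulent class would be more informative than plain accuracy.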

Scalable Architectures for Parallel Data Processing

Big data has exposed us to an ever-growing volume of data that cannot be handled by traditional data management and analysis systems. This has given rise to the use of scalable system architectures to process big data efficiently and exploit its true value. The research aims to analyse the current state of practice in scalable architectures and identify common patterns and techniques for designing scalable architectures for parallel data processing.

  • To design and implement a prototype scalable architecture for parallel data processing
  • To evaluate the performance and scalability of the prototype architecture using benchmarks and real-world datasets
  • To compare the prototype architecture with existing solutions and identify its strengths and weaknesses
  • To evaluate the trade-offs and limitations of different scalable architectures for parallel data processing
  • To provide recommendations for the use of the prototype architecture in different scenarios, such as batch processing, stream processing, and interactive querying
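As a starting point for such a prototype, the split, map, and merge pattern behind many parallel data processing systems can be sketched as follows. This is a minimal single-machine illustration, not a scalable architecture: the function names are hypothetical, and a thread pool from Python's standard library stands in for the worker nodes of a real cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, n_chunks):
    """Split the input into at most n_chunks contiguous chunks."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Map step: count words within one chunk, independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(records, workers=4):
    """Run the map step on a pool of workers, then merge the partial counts."""
    chunks = split_into_chunks(records, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    total = Counter()
    for partial in partials:  # reduce step
        total.update(partial)
    return total


if __name__ == "__main__":
    lines = ["big data", "Big models", "data data"]
    print(parallel_word_count(lines, workers=2))
```

Because the map step touches only its own chunk and the reduce step is an associative merge, the same structure carries over to process pools or to separate machines; in CPython, threads will not speed up this CPU-bound example, which is one of the trade-offs a benchmark would expose.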

Robotic manipulation modelling

The aim of this research is to develop and validate a model-based control approach for robotic manipulation of small, precise objects.

  • Develop a mathematical model of the robotic system that captures the dynamics of the manipulator and the grasped object.
  • Design a control algorithm that uses the developed model to achieve stable and accurate grasping of the object.
  • Test the proposed approach in simulation and validate the results through experiments with a physical robotic system.
  • Evaluate the performance of the proposed approach in terms of stability, accuracy, and robustness to uncertainties and perturbations.
  • Identify potential applications and areas for future work in the field of robotic manipulation for precision tasks.

Big data analytics and its impacts on marketing strategy

The aim of this research is to investigate the impact of big data analytics on marketing strategy and to identify best practices for leveraging this technology to inform decision-making.

  • Review the literature on big data analytics and marketing strategy to identify key trends and challenges
  • Conduct a case study analysis of companies that have successfully integrated big data analytics into their marketing strategies
  • Identify the key factors that contribute to the effectiveness of big data analytics in marketing decision-making
  • Develop a framework for integrating big data analytics into marketing strategy.
  • Investigate the ethical implications of big data analytics in marketing and suggest best practices for responsible use of this technology.


Platforms for large scale data computing: big data analysis and acceptance

To investigate the performance and scalability of different large-scale data computing platforms.

  • To compare the features and capabilities of different platforms and determine which is most suitable for a given use case.
  • To identify best practices for using these platforms, including considerations for data management, security, and cost.
  • To explore the potential for integrating these platforms with other technologies and tools for data analysis and visualization.
  • To develop case studies or practical examples of how these platforms have been used to solve real-world data analysis challenges.

Distributed data clustering

Distributed data clustering can be a useful approach for analyzing and understanding complex datasets, as it allows for the identification of patterns and relationships that may not be immediately apparent.

To develop and evaluate new algorithms for distributed data clustering that are efficient and scalable.

  • To compare the performance and accuracy of different distributed data clustering algorithms on a variety of datasets.
  • To investigate the impact of different parameters and settings on the performance of distributed data clustering algorithms.
  • To explore the potential for integrating distributed data clustering with other machine learning and data analysis techniques.
  • To apply distributed data clustering to real-world problems and evaluate its effectiveness.
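One common way to distribute clustering is to let every data partition compute local statistics and have a coordinator merge them. The sketch below applies that idea to k-means; it is an illustration under assumptions, with hypothetical function names and with the partitions processed in a plain loop where a real system would run them on separate nodes.

```python
def nearest(point, centroids):
    """Index of the centroid closest to the point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def local_stats(partition, centroids):
    """Per-partition step: coordinate sums and point counts for each cluster."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in partition:
        k = nearest(point, centroids)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += point[d]
    return sums, counts


def merge_and_update(stats, centroids):
    """Coordinator step: combine the partition statistics into new centroids."""
    dim = len(centroids[0])
    updated = []
    for k, old in enumerate(centroids):
        total = sum(counts[k] for _, counts in stats)
        if total == 0:
            updated.append(list(old))  # keep a centroid that attracted no points
        else:
            updated.append(
                [sum(sums[k][d] for sums, _ in stats) / total for d in range(dim)]
            )
    return updated


def distributed_kmeans(partitions, centroids, iterations=10):
    """Alternate local statistics and global merging for a fixed number of rounds."""
    for _ in range(iterations):
        stats = [local_stats(part, centroids) for part in partitions]
        centroids = merge_and_update(stats, centroids)
    return centroids


if __name__ == "__main__":
    parts = [[(0, 0), (0, 1), (10, 10)], [(1, 0), (10, 11), (11, 10)]]
    print(distributed_kmeans(parts, [[0.0, 0.0], [10.0, 10.0]]))
```

Only the per-cluster sums and counts travel between partitions and coordinator, never the raw points, which is what makes the communication cost independent of the partition size.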

Analyzing and predicting urbanization patterns using GIS and data mining techniques

The aim of this project is to use GIS and data mining techniques to analyze and predict urbanization patterns in a specific region.

  • To collect and process relevant data on urbanization patterns, including population density, land use, and infrastructure development, using GIS tools.
  • To apply data mining techniques, such as clustering and regression analysis, to identify trends and patterns in the data.
  • To use the results of the data analysis to develop a predictive model for urbanization patterns in the region.
  • To present the results of the analysis and the predictive model in a clear and visually appealing way, using GIS maps and other visualization techniques.

Use of big data and IoT in the media industry

Big data and the Internet of Things (IoT) are emerging technologies that are transforming the way information is collected, analyzed, and disseminated in the media sector. The aim of the research is to understand how big data and IoT are used to dictate information flow in the media industry.

  • Identifying the key ways in which big data and IoT are being used in the media sector, such as for content creation, audience engagement, or advertising.
  • Analyzing the benefits and challenges of using big data and IoT in the media industry, including factors such as cost, efficiency, and effectiveness.
  • Examining the ethical and social implications of using big data and IoT in the media sector, including issues such as privacy, security, and bias.
  • Determining the potential impact of big data and IoT on the media landscape and the role of traditional media in an increasingly digital world.

Exigency computer systems for meteorology and disaster prevention

The research aims to explore the role of exigency computer systems in detecting weather and other hazards for disaster prevention and response.

  • Identifying the key components and features of exigency computer systems for meteorology and disaster prevention, such as data sources, analytics tools, and communication channels.
  • Evaluating the effectiveness of exigency computer systems in providing accurate and timely information about weather and other hazards.
  • Assessing the impact of exigency computer systems on the ability of decision makers to prepare for and respond to disasters.
  • Examining the challenges and limitations of using exigency computer systems, such as the need for reliable data sources, the complexity of the systems, or the potential for human error.

Network security and cryptography

The goal of this research is to improve our understanding of how to protect communication and information in the digital age, and to develop practical solutions for addressing the complex and evolving security challenges faced by individuals, organizations, and societies.

  • Developing new algorithms and protocols for securing communication over networks, such as for data confidentiality, data integrity, and authentication.
  • Investigating the security of existing cryptographic primitives, such as encryption and hashing algorithms, and identifying vulnerabilities that could be exploited by attackers.
  • Evaluating the effectiveness of different network security technologies and protocols, such as firewalls, intrusion detection systems, and virtual private networks (VPNs), in protecting against different types of attacks.
  • Exploring the use of cryptography in emerging areas, such as cloud computing, the Internet of Things (IoT), and blockchain, and identifying the unique security challenges and opportunities presented by these domains.
  • Investigating the trade-offs between security and other factors, such as performance, usability, and cost, and developing strategies for balancing these conflicting priorities.
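To make two of the goals above concrete, data integrity and authentication can be illustrated with a keyed hash (HMAC). The sketch below uses only Python's standard `hmac`, `hashlib`, and `secrets` modules; the helper names `sign` and `verify` are hypothetical, and the example assumes both parties already share a secret key. It does not provide confidentiality, which would require encryption in addition.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag that binds the message to the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


if __name__ == "__main__":
    shared_key = secrets.token_bytes(32)  # assumed to be exchanged securely
    message = b"sensor reading: 21.5"
    tag = sign(shared_key, message)
    print("untampered message accepted:", verify(shared_key, message, tag))
    print("tampered message accepted:", verify(shared_key, b"sensor reading: 99.9", tag))
```

The constant-time comparison is a small instance of the security and performance trade-off mentioned in the last objective: it is slightly slower than a plain equality check but avoids leaking timing information to an attacker.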


Review Article: The Applicability of Big Data in Climate Change Research: The Importance of System of Systems Thinking


  • 1 MTA-PE “Lendület” Complex Systems Monitoring Research Group, University of Pannonia, Veszprém, Hungary
  • 2 Sustainability Solutions Research Lab, University of Pannonia, Veszprém, Hungary

The aim of this paper is to provide an overview of the interrelationship between data science and climate studies, and to describe how climate-related sustainability issues can be managed using Big Data tools. Climate-related Big Data articles are analyzed and categorized, which reveals an increasing number of applications of data-driven solutions in specific areas, while broad integrative analyses receive less attention. Our major objective is to highlight the potential of the System of Systems (SoS) theorem, as the synergies between diverse disciplines and research ideas must be explored to gain a comprehensive overview of the issue. Data and systems science enables large amounts of heterogeneous data to be integrated and simulation models to be developed, while considering socio-environmental interrelations in parallel. The improved knowledge integration offered by System of Systems thinking or climate computing is demonstrated by analysing the possible inter-linkages of the latest Big Data application papers. The analysis highlights how data and models focusing on specific areas of sustainability can be bridged to study the complex problems of climate change.

1. Introduction

Climate change is a pressing issue of today, and data-based models and decision support techniques offer a more comprehensive understanding of its complexity. The aim of this paper is to reveal data-based techniques and their applicability to climate research. More precisely, it asks how Big Data, through data science, can address climate-related sustainability issues and be applied in scientific research and decision sciences in an integrated manner.

The overview is guided by three closely related notions: (1) data science as a novel interdisciplinary field, connected to (2) machine learning, a tool for improving automatic prediction or decision processes, and (3) Big Data, which fosters the processing and connecting of large amounts of heterogeneous data. The focal point of this research is the interconnectedness of complex climate-related systems, for the exploration of which Big Data provides an efficient toolbox.

The research questions cover three aspects, which are kept in focus throughout the paper:

• How and when does Big Data appear in climate-related studies?

• What research has been carried out on Big Data applications in climate studies, and how is it structured?

• How can the knowledge accumulated in diverse, specific research efforts be integrated?

The year 2015 brought further momentum to research on climate change. The United Nations declared 17 Sustainable Development Goals, of which SDG 13 is “Take urgent action to combat climate change and its impacts” ( UN, 2016 ), and the Paris Agreement was adopted, which concerns the mitigation of greenhouse gas emissions, adaptation, and finance. Its specific aim is to keep the global average temperature rise well below 2°C above pre-industrial levels and to pursue efforts to limit the rise to 1.5°C above pre-industrial levels, recognizing that this will significantly reduce the risks and impacts of climate change ( Rogelj et al., 2016 ). This kind of organizing principle supports the complex analysis of the classical disciplinary sciences with a holistic, interdisciplinary approach. New types of approaches require much more complex analyses and models and, therefore, several orders of magnitude more data, which brought Big Data to life as a stand-alone scientific discipline.

Big Data-based tools are already widespread in this new complex science, for example, to monitor seasonal changes in climate change ( Manogaran et al., 2018 ), understand climate change as a theory-guided data science paradigm ( Faghmous et al., 2014 ), learn how to manage the risks of climate change ( Ford et al., 2016 ), explore soft data sources, e.g., Twitter ( Jang et al., 2015 ), or demonstrate the potential of Systems of Systems (SoS), for instance, the exploration of the structure and relationships across institutions and disciplines of a global Big Earth Data cyber-infrastructure: the Global Earth Observation System of Systems (GEOSS) ( Craglia et al., 2017 ).

Today, it is obvious that sustainability science is intertwined with data science; however, with the support of the business model of the circular economy ( Jabbour et al., 2019 ), the complexity of the problem repository has further increased, so there is an urgent need to include data and analysis methods in the framework, whereby research results from different fields can be used in other fields. Furthermore, trends in climate and sustainability science are driving models toward higher resolution, greater complexity, and larger ensembles, which calls for multidisciplinary approaches in climate computational sciences ( Balaji, 2015 ). This research provides a higher-level overview of the interconnectedness of disciplines, systems, data, and tools related to climate change, exploring further focal points concerning the need for a deeper level of integration, because a disconnection between important industry initiatives and scientific research is still experienced ( Nobre and Tavares, 2017 ). We propose to address these integration tasks and disconnections through System of Systems thinking.

This overview seeks to address these shortcomings. Information sources (data, news, scientific databases) can be linked, drawing attention to the future importance of open linked data. The present research draws attention to System of Systems (SoS) thinking, as the drivers and effects of climate change, as well as resilience and adaptation, can only be achieved through the timely recognition and exploitation of synergies and trade-offs between the new research directions.

The research methodology first identifies the problems of sustainability science in section 2, which reveals the connected issues and tasks as well as the requirements needed to succeed. It shows that the sustainable operation of nature and society demands a System of Systems approach along with the integration of Big Data applications into climate-related scientific, societal, and political research. This is in line with the growing risk of uncertainty zones highlighted in the planetary boundary framework ( Steffen et al., 2015 ). Then, the existing applications of the related data analysis in the field were explored. For deeper and more focused insight, the literature review was based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) method, which contributes to the exploration and evaluation of related articles. The search has a clear and narrow focus on the multidisciplinary nature of the issue; a generic evaluation is therefore not intended. Fifty-seven review articles were individually analyzed to identify focus areas and research gaps in Big Data applications in climate change research. Systematic meta-analysis was used to identify how data cluster into diverse focus areas and to extract valuable structural information. The co-occurrences of keywords were examined with regard to 442 articles describing the relationship between climate change and Big Data.
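A keyword co-occurrence analysis of this kind counts how often two keywords are attached to the same article. The sketch below shows one minimal way to compute such counts; it is not the authors' pipeline, and the function name and the sample keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count, for every keyword pair, the number of articles listing both."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Normalise case and whitespace, and drop duplicates within an article.
        unique = sorted({k.strip().lower() for k in keywords})
        pairs.update(combinations(unique, 2))
    return pairs


if __name__ == "__main__":
    articles = [  # invented keyword lists, for illustration only
        ["Big Data", "climate change"],
        ["climate change", "big data", "machine learning"],
        ["machine learning", "Big Data"],
    ]
    for pair, count in cooccurrence(articles).most_common():
        print(pair, count)
```

The resulting pair counts can be read as edge weights of a keyword network, on which clusters of closely related topics can then be identified.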

In the following sections, the aforementioned research questions are unfolded and answered by revealing the increasing importance of the System of Systems theorem. Synergies between new research directions and disciplines must be explored to determine the drivers and effects of climate issues and to provide an efficient strategic adaptation and mitigation plan that also considers socio-environmental factors. Our proposed SoS framework is a response to this need for integrated knowledge management, as a first step toward climate computing.

In section 2, the questions of the sustainability science theorem are answered, considering the essential need for data science applications. In section 3, heterogeneous data management as well as Big Data tools and techniques are emphasized.

The systematic review of climate change analyses can be found in section 4, which includes the connections between Big Data and climate in section 4.1 as well as a critical summary of different methods in section 4.2. The social aspects are highlighted in section 4.3. Based on the overview of the new climate-related research findings, a specific SoS framework is presented in section 4.4, and the intertwining of the SoS and the SDGs is discussed in section 5, where the suggestions for future research directions and applications are summarized.

2. Problems of Sustainability Science

The complexity of climate issues requires adaptive strategies for public policy ( Di Gregorio et al., 2019 ), actions to incite social behavior ( Xie B. et al., 2019 ), and the development of regulatory and market-simulating responses to economic life ( Wright and Nyberg, 2017 ). To meet this complex societal need, research has focused on understanding the causes of climate change ( Hegerl et al., 2019 ), the development of predictive models ( Du et al., 2019 ), and mitigation solutions ( Gomez-Zavaglia et al., 2020 ), as well as the exploration of opportunities to shape social attitudes ( Iturriza et al., 2020 ).

An interdisciplinary approach is essential for identifying almost every climate-related problem and developing its solutions. This interdisciplinary perspective has formed the sustainability science theorem, which seeks a comprehensive understanding of the interrelationship between environment and society ( Kates et al., 2001 ). This theory focuses on transdisciplinary questions, which can only be answered by applying data science tools.

• How can the dynamic relationship between nature and society be described and analyzed?

Systems Dynamics Modeling tends to be a commonly used tool when describing and analysing the dynamic interrelation of environment, economy, and society ( Honti and Abonyi, 2019 ). This concept is clearly characterized by the World3 model, which describes the relationship between population, industrial growth, food production, and ecosystem constraints over time for the Club of Rome in the book entitled “The Limits to Growth” ( Meadows et al., 1972 ). The exploration of the relationship between the state variables of the model requires targeted interdisciplinary research. The tools of data science can render this research more efficient with the automated generation and validation of relationship hypotheses ( Sebestyén et al., 2019 ), as data-based models beyond the exploration of probabilistic correlations can provide information on causation ( Dörgő et al., 2018 ). One of the most significant tasks for the more in depth analysis of climate effects is the integration and joint management of heterogeneous data and information. The proof of this potential approach is a case study that interlinks socio-economic variables to explore the effect of the climate on global food production systems ( Fischer et al., 2005 ).

• How can delays, inertia, and uncertainty in models be handled?

To quantify the impact of uncertainties inherent in climate variables, Monte Carlo simulations can be a suitable way of evaluating the CMIP models developed to forecast climate change ( Taylor et al., 2012 ; Eyring et al., 2016 ) under the Representative Concentration Pathways RCP 4.5 and RCP 8.5 ( Mallick et al., 2018 ). The most important task ahead is the integrated development of targeted solutions for designing, evaluating and integrating simulation studies to quantify uncertainty and risk in the light of environmental and social data ( Climate Change, 2014 ). For this reason, DKRZ carried out extensive simulations with the Earth system model MPI-ESM for the CMIP5 project and the IPCC AR5, presenting a selection of visualizations for different key climate variables and for the different scenarios ( Klimarechenzentrum, 2021 ).
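As a minimal sketch of the Monte Carlo idea mentioned above, the following Python snippet propagates the uncertainty of one input through a toy response function. The linear `warming` function, the assumed sensitivity distribution and all numeric values are illustrative assumptions; they are not taken from the CMIP models or from the cited studies.

```python
import random
import statistics


def warming(sensitivity, forcing):
    """Toy response: warming (degC) as sensitivity times forcing.

    Hypothetical stand-in for a real climate model run.
    """
    return sensitivity * forcing


def monte_carlo(forcing, n_samples=10_000, seed=42):
    """Sample an uncertain sensitivity and summarise the resulting warming."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    results = []
    for _ in range(n_samples):
        # Assumed uncertainty: sensitivity ~ Normal(0.8, 0.2), truncated at zero.
        s = max(0.0, rng.gauss(0.8, 0.2))
        results.append(warming(s, forcing))
    results.sort()
    return {
        "mean": statistics.fmean(results),
        "p05": results[int(0.05 * n_samples)],  # empirical 5th percentile
        "p95": results[int(0.95 * n_samples)],  # empirical 95th percentile
    }


# The pathways are named after their radiative forcing in W/m^2;
# using that number directly as model input is a simplification.
for name, forcing in [("RCP 4.5", 4.5), ("RCP 8.5", 8.5)]:
    print(name, monte_carlo(forcing))
```

The spread between the 5th and 95th percentiles is the quantity of interest here: it shows how input uncertainty translates into a range of outcomes for each scenario rather than a single value.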

• How can the features concerning the vulnerability of socio-environmental systems be explored?

The conceptual framework of vulnerability is grounded by the Intergovernmental Panel on Climate Change (IPCC). The complex impact chains of vulnerability demand the identification and integration of non-climatic factors into climate models, in addition to the development of models describing adaptability as well as the estimation of expected damage ( Füssel and Klein, 2006 ). It is believed that the toolbox of network science will play an increasing role in evaluating vulnerability, as the significance of state variables and their relationships can be directly qualified regarding their role in dynamic models ( Leitold et al., 2020 ).

• How can the increasing risk be measured? What scientifically based “boundaries” and “limits” can be defined?

The purpose of the planetary boundaries concept is to define operating conditions and to account for adverse or catastrophic abrupt environmental changes in the crossing of one or more planetary boundaries ( Rockström et al., 2009 ). Quantifying the risks of climate-induced changes using climate models shows that the risks will increase over the next 200 years, even if the composition of the atmosphere remains constant ( Scholze et al., 2006 ). The socio-cultural domain plays a crucial role in terms of risk perception ( Van der Linden, 2015 ), therefore, the integration of variables describing socio-cultural factors into the models can be particularly important. Analyses are essential to explore how human-induced perturbations affect the delicate balance of the ecosystem in addition to determining where the limits and boundaries are, the crossing of which would pose an unacceptable level of risk ( Steffen et al., 2015 ). The integrated application of simulation tools and machine learning toolbox can efficiently explore these boundaries ( Lenton, 2011 ).

• What support/motivation systems can be developed—rules, norms, scientific information—to increase the capacity and sustainability of society? What signs and guidelines are needed to put society on a sustainable path? How can today's isolated research, analyses, and decision support systems be integrated more efficiently?

The integration and targeted systematization of scientific knowledge is needed to address the long-term causes of climate change and reduce its effects ( Pauliuk, 2020 ). Research concerning sustainability and socio-ecological systems has been partly interlinked to foster sustainability transformation in a transdisciplinary manner. For bridging the gap between science and society, the involvement of citizens in framing research and processes may be a solution as “through their relationship to a place, bounded often as a social-ecological construct, stakeholders, and people at large play an essential role in sustainability transformation research.” Furthermore, the involvement of external parties can support research into socio-ecological systems and sustainability science ( Horcea-Milcu et al., 2020 ). Methods of the co-production of knowledge, e.g., triangulation, the Multiple Evidence Based approach and scenario building, by learning about cross-border engagement, help to ensure that transdisciplinarity is not only a precursor of integration ( Klenk and Meehan, 2015 ).

To follow the aforementioned path toward sustainable dynamics of nature and society, the data science toolbox and models must be integrated into climate change-related scientific and societal research as well as political agenda. In the following, the Big Data tools and management are interpreted with a specific focus on their role in climate change and we build a System of Systems (climate computing) framework from the various applications.

3. Data Analysis Tasks of Climate Change Research

The term Big Data has spread due to new technologies and innovations that have emerged over the past decade ( Chen and Chiang, 2012 ), driven by the demand for the analysis of large amounts of rapidly generated, diverse data. Collection and processing therefore take place at a high speed, which is difficult to implement with conventional analytical tools ( Constantiou and Kallinikos, 2015 ). The explosive leap in the amount of data has also infiltrated health, finance, and education ( Benjelloun et al., 2015 ). With regard to the global economy, Big Data is key to understanding and increasing performance ( Maria et al., 2015 ). Big Data is also gaining ground in the field of sustainability, where it can be used to improve social and environmental sustainability in supply chains ( Dubey et al., 2019 ), augment the informational landscape of smart sustainable cities ( Bibri, 2018 ), and improve the allocation and utilization of natural resources ( Song et al., 2017 ) as well as supply chain sustainability ( Hazen et al., 2016 ).

Big and open data from “smart” government to transformational government can facilitate collaboration. It is possible to introduce real-time solutions into agriculture, health, transport, and other challenges ( Bertot et al., 2014 ). The Big Data approach can be the most effective tool to improve mutual governmental and civic understanding, thus embodying the principles of digital governance as the most viable public management model ( Clarke and Margetts, 2014 ). There is a need to collect large amounts of data that can be used to model and test different scenarios to sustainably transform energy production and consumption, improve food and water security, as well as eradicate poverty. Initiatives such as the Intergovernmental Panel on Climate Change and the Global Ocean Observing System can fill gaps in scientific, technical and socio-economic data ( Gijzen, 2013 ). The analysis of sustainable business performance forecasts through the analysis of Big Data in the context of developing countries shows that “Management and leadership style” and “Government policy” are the most significant factors at present ( Raut et al., 2019 ).

The process of data mining is shown in Figure 1 .

Figure 1 . The process of data mining.

Big Data is a rapidly generated amount of information from a variety of sources and in different formats. Data analysis is the examination and transformation of raw data into interpretable information, while data science is a multidisciplinary field that draws on various analyses, programming tools, algorithms, forecasting, statistics, and machine learning to recognize and extract patterns in raw data. Thus, Big Data primarily looks at ways to analyse, systematically extract or otherwise handle data from datasets that are too large or complex to handle with traditional data processing application software and that require significant scaling (multiple nodes) to process efficiently. In other words, Big Data can be defined by the 5V key characteristics, i.e., volume, velocity, variety, veracity, and value ( Laney, 2001 ).

The storage, sustainability, and analysis of massive content is a challenge that the current state of algorithms and systems cannot handle ( Trifu and Ivan, 2014 ) in an integrated manner, therefore the synergies of the different sources are not sufficiently exploited. The purpose of using Big Data is to provide data management and analysis tools for the ever-increasing amount of data ( Anuradha et al., 2015 ). As is shown in Figure 2 , data analysis can be divided into four general categories ( Erl et al., 2016 ). In the environments of Big Data analytics, data analytics involves the use of highly scalable distributed frameworks and technologies to extract meaningful information from large amounts of raw data that requires the use of different data analysis methods ( Rajaraman, 2016 ).

Figure 2 . The types of data analytics.

Big Data is usually associated with two technologies, cloud computing and the Internet of Things (IoT) ( Honti and Abonyi, 2019 ). Cloud computing accelerates unlimited data storage, parallel data processing, and analysis ( Inukollu et al., 2014 ). The key benefits of cloud computing are improved analysis, simplified infrastructure, and cost reduction. IoT offers the ability to connect computing devices, mechanical and digital machines as well as objects and people ( Lavin et al., 2015 ). With the advent of the IoT, huge amounts of data can be collected using smart devices connected via the Internet ( Suchetha et al., 2015 ).

The applicability of Big Data techniques is also significantly enhanced by the novel tools that support data collection and integration. The interoperability of the systems can be improved by data warehouses and the related ETL (extract, transform, load) functionalities that can also be used to gather information from multiple models and data sources. The benefits of these structures are demonstrated by the EC4MACS (European Consortium for Modeling of Air Pollution and Climate Strategies) data warehouse, which establishes a suite of modeling tools for a comprehensive integrated assessment of the effectiveness of emission control strategies for air pollutants and greenhouse gases. In this system, the integrated data are loaded into the GAINS (Greenhouse gas-Air pollution Interactions and Synergies) Data Warehouse. This assessment brought together expert knowledge in the fields of energy, transport, agriculture, forestry, land use, atmospheric dispersion, health and vegetation impacts, and it developed a coherent outlook into the future options to reduce atmospheric pollution in Europe ( Nguyen et al., 2012 ).

The integration of different information can also be supported by ontology-based linked data. Web Ontology Language (OWL) models enable the semantic characterization of the different events that can describe the climate change story from multiple perspectives, including scientific, social, political, and technological ones ( Pileggi et al., 2020 ).

Artificial intelligence (AI) and machine learning (ML) are also the key enabler technologies of big data analysis. This paper focuses on the applicability of ML-based models. AI is mainly used to support decision-making, but it also can skilfully fill observational gaps when combined with numerical climate model data. An example of this application can be found in the extension of historical temperature measurements used in global climate datasets like HadCRUT4 ( Kadow et al., 2020 ).

Analysis of Big Data combines traditional methods of statistical analysis with computational approaches. Based on the complexity between the variables and the type of results required, data analysis can be a simple data set query or a combination of sophisticated analysis techniques ( Al-Shiakhli, 2019 ). The analysis of Big Data is a synthesis of quantitative and qualitative analyses. Climate computing combines multidisciplinary researches in regard to climatic, data and system sciences to efficiently capture and analyse climate-related Big Data as well as to support socio-environmental efforts. Underlying this aspect, a complex model of the earth system is continuously developed by DKRZ using supercomputers relying on Big Data, numerical computations, and simulation models to enable scientists to integrate chemical and biological processes, as well as investigate the interaction of the climate and the socio-economic system ( Klimarechenzentrum, 2021 ).

Exploratory Data Analysis (EDA) techniques are approaches for analysing large data sets. These techniques make the main features clearer by hiding other aspects. Most EDA techniques are graphical in nature, with some non-graphical additions. Some basic EDA tools are histograms, quantile-quantile plots (Q-Q plots), scatter plots, box plots, stratification, log transformation, and other summary statistics ( Komorowski et al., 2016 ). Qualitative models can be classified into qualitative causal models and abstraction hierarchies. The causal models can be classified into Digraphs, Fault Trees, and Qualitative Physics. Abstraction hierarchies consist of two important components: structural and functional ( Venkatasubramanian et al., 2003 ).
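The non-graphical side of the EDA tools listed above can be sketched with the Python standard library alone. The precipitation values below are hypothetical and chosen only to be right-skewed; the functions `summarize` and `histogram` are illustrative helpers, not part of any cited toolchain.

```python
import math
import statistics


def summarize(values):
    """Return basic non-graphical EDA statistics for a numeric sample."""
    q1, median, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


def histogram(values, n_bins=5):
    """Count values in equal-width bins (a text stand-in for a histogram)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width for constant data
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # the maximum goes in the last bin
        counts[idx] += 1
    return counts


# Hypothetical, right-skewed daily precipitation totals in mm.
precip = [0.2, 0.5, 0.8, 1.1, 1.5, 2.0, 3.2, 4.8, 9.5, 41.0]
print(summarize(precip))
print("raw bins:", histogram(precip))
# A log transformation spreads out the skewed values across the bins.
print("log bins:", histogram([math.log(v) for v in precip]))
```

Comparing the raw and log-transformed bin counts illustrates why log transformation appears among the basic EDA tools: a single extreme value no longer dominates the binning.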

Data mining is a set of methods that extracts certain information from large and complex databases. Data discovery uses automated, software-based techniques to eliminate randomness and uncover hidden patterns and trends ( Fayyad and Simoudis, 1997 ). The classification of data mining techniques is summarized in Table 1 ( Zaki and Ho, 2000 ), including a straightforward description of the method, common analytical techniques, the definition of relevant application areas and examples related to climate studies.

Table 1 . Data mining techniques and areas of application.

Classification is fundamental in terms of data mining techniques ( Zaki and Ho, 2000 ). Classification models define the similarity structure of the variables and partition them into groups (classes) ( Aggarwal, 2015 ). In Big Data-based climate studies, classification models and techniques are greatly utilized. Two streams with different hydroclimatologies were studied in the United States using an artificial neural network (ANN). The analysis identified a large effect on a variety of factors such as average runoff, flow variability, flood frequency and baseline flow stability ( Poff et al., 1996 ). To overcome the great uncertainties inherent in climate models, an alternative neural network-based climate model has been developed that increases the efficiency of large climate model sets by at least one order of magnitude. Based on this, it can be concluded that warming exceeds the surface warming range estimated by the IPCC for almost half of the members of the ensemble ( Knutti et al., 2003 ). Neural networks are an effective tool for dealing with such difficult and challenging problems and have been widely used to explore the mechanisms of climate change and predict trends in climate change. They take full advantage of the unknown information hidden in climate data; however, they cannot decipher it.

General Circulation Models (GCMs), the most advanced tools for estimating future climate change scenarios, operate on a coarse scale, which can be downscaled by support vector machine (SVM) approaches, training meteorological subdivisions (MSDs) and developing a downscaling model (DM) that has been shown to be better than conventional downscaling using multilayered regenerative artificial neural networks ( Tripathi et al., 2006 ). The utilization of solar energy is evolving dynamically in connection with SDG 7, but power plant performance may fluctuate due to the diversity of meteorological conditions, which can be compensated for by using satellite imagery and an SVM learning scheme to predict the motion vectors of clouds ( Jang et al., 2016 ). Object-based image analysis (OBIA) and support vector machine (SVM) combined with a decision-tree classification are suitable for mapping mangrove areas, which was possible only at a rough spatial resolution with traditional remote sensing methods ( Heumann, 2011 ). Decision tree algorithms consistently outperform maximum likelihood and linear discriminant function classifiers in terms of classification accuracy in land cover mapping problems ( Friedl and Brodley, 1997 ). Using a weather-generating model, which allows nearest-neighbor resampling with perturbation of historical data, it is possible to create a set of probable climatic scenarios and produce meteorological data that can be used to assess the vulnerability of the river basin to extreme events ( Sharif and Burn, 2006 ). The ability of the Bayesian Network (BN) to predict long-term changes in the shoreline associated with rises in sea level and quantitatively estimate forecast uncertainty renders it suitable for research into the effects of climate change ( Gutierrez et al., 2011 ). It has been used successfully to assess the effects of climate change disturbances on the structure of coral reefs ( Franco et al., 2016 ) and in terms of belief updating concerning the reality of climate change in response to presenting information concerning the scientific consensus on anthropogenic global warming (AGW) ( Cook and Lewandowsky, 2016 ). Using a genetic algorithm and occurrence data from museum specimens, ecological niche models were developed for 1,870 species occurring in Mexico and projected onto two climatic surfaces modeled for 2055 ( Peterson et al., 2002 ). A multi-objective genetic algorithm for optimizing water distribution systems (WDS) was used as a discovery tool to examine trade-offs between traditional economic goals and the minimization of greenhouse gas emissions ( Wu et al., 2010 ). The European territory was subdivided into similar regions of predicted climate change based on simulations of total daily precipitation as well as recent (1986–2005) and long-term future (2081–2100) temperatures using k-means cluster analysis ( Carvalho et al., 2016 ). An automated procedure based on a cluster initialization algorithm is proposed and applied to changes in 27 climatic extremes. The proposed method requires, on average, 40% fewer scenarios to meet the 90% threshold than k-means clustering ( Cannon, 2015 ).

Clustering-based analyses are widely accepted data mining techniques, however, improvements in terms of time and cost savings are constantly required due to the management of an increasing amount of data ( Shirkhorshidi et al., 2014 ). Regarding its usage in climatic analyses, a clustering-based spatio-temporal analysis framework of atmospheric data was developed to support both governmental and industrial decision-making processes ( Cuzzocrea et al., 2019 ). To assess erosivity risk, clustering and classification analyses were applied on the national level in Turkey, moreover, an artificial neural network-based prediction was also made. The results identified an increasing risk of soil erosion in the southern and western regions of Turkey, which demands erosion control practices ( Aslan et al., 2019 ). Research has been conducted to regionalize Europe according to similar surface temperatures based on data between 1986 and 2005. The differences between long-term predictive data (CMIP5) and historical data were analyzed with k-means clustering analyses to determine grid points ( Carvalho et al., 2016 ). A fuzzy c-means approach regionalization was determined in western India for the analysis of meteorological drought homogeneous regions to provide effective support for water resources planning and management during droughts ( Goyal and Sharma, 2016 ). Clustering techniques can support simulation and predict models by grouping large-scale data. “Wind energy production is expected to be affected by shifts in wind patterns that will accompany climate change.” In California, wind patterns have been clustered using model simulations from the variable-resolution Community Earth System Model (VR-CESM) and analyzed according to the change in the frequency of clusters and changes in winds within clusters. The changes in capacity factor have significant influence with regard to energy generation ( Wang M. et al., 2020 ).
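The k-means regionalization referred to above can be illustrated with a plain implementation of Lloyd's algorithm. This is a sketch under stated assumptions: the grid-point values (projected temperature change in degC, precipitation change in %) are invented for illustration, and the function `kmeans` is a hypothetical helper rather than the procedure used in the cited studies.

```python
import random


def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means (Lloyd's algorithm) on tuples of floats."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise with k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical grid points: (temperature change in degC, precipitation change in %).
grid = [(1.1, -5.0), (1.3, -4.0), (1.2, -6.0), (3.0, 10.0), (3.2, 12.0), (2.9, 9.0)]
labels, centroids = kmeans(grid, k=2)
print(labels)
print(centroids)
```

In practice the variables would be standardized first, since unscaled features with large numeric ranges dominate the Euclidean distance; the sketch omits this step for brevity.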

Regression analysis seeks to reveal functional relationships between variables that can further support predictive and forecasting models. Urbanization tends to have a significant impact on climate change, as underlined by an Australian study which determined that changes in land use and vegetation resulting from urbanization affect the local climate and water cycle, and that these impacts are location-specific ( Maheshwari et al., 2020 ). Multiple regression-based analysis has been used to determine flood risk in urban catchments by combining multiple linear regression, multiple nonlinear regression and multiple binary logistic regression. This framework seeks to support action plans concerning drainage management and maximize the impacts of strategic flood susceptibility measures ( Jato-Espino et al., 2018 ). Regarding water management, the influence of climate change on the hydrological cycle in the Yangtze River Basin has been analyzed using a regression analysis model and geographic information system ( Keliang, 2019 ). Soil plays a significant role in carbon sequestration and therefore moderates undesired climatic effects. A model has been designed for the top 25 cm of topsoil of the Sierra Morena (Red Natura 2000) area to determine the relationship between independent variables and soil organic carbon (SOC) and, using multiple linear regression analysis, to examine the effects of these variables on SOC content. The results indicated that “SOC in a future scenario of climate change depends on average temperature of coldest quarter (41.9%), average temperature of warmest quarter (34.5%), annual precipitation (22.2%), and annual average temperature (1.3%).” The comparison between the current (2016) and future situations reflects a 35.4% reduction in SOC content and a trend in northward migration ( Olaya-Abril et al., 2017 ).

Frequent itemset/pattern mining is a commonly used technique to extract knowledge from databases. The handling of an increasing amount of heterogeneous data is becoming ever more difficult, therefore, “an efficient algorithm is required to mine the hidden patterns of the frequent itemsets within a shorter run time and with less memory consumption while the volume of data increases over the time period” ( Chee et al., 2019 ). Association rule mining (ARM) models have been built for atmospheric environment monitoring based on the Apriori algorithm and D-S theory/ER algorithm. These techniques provide both technical and theoretical support to prevent as well as manage air pollution ( Li et al., 2019 ). Association rule mining has also been used for monitoring weather behavioral data to develop a prediction model for climate variability ( Rashid et al., 2017 ). Furthermore, climate variability has an impact on agriculture, which demands a greater understanding of the impact of the climate on crop production and food security. Therefore, the impact of seasonal rainfall on rice crop yield was determined based on ARM techniques ( Gandhi and Armstrong, 2016 ). To understand wind conditions, multidimensional sequential pattern mining is used, which can define which pattern is suitable for wind energy (by taking into consideration the factors of space, time, and height). According to a study on the Netherlands, 68.97% of the country is covered by a suitable wind pattern (at 128 m) and already has wind turbines installed ( Yusof et al., 2017 ). A spatio-temporal pattern-based sequence classification framework was built to estimate the extent of deforestation. This approach was applied to a Tunisian case study that took into consideration 15 years of satellite images and historical wildfire GIS data ( Toujani et al., 2020 ).
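The level-wise search behind Apriori-style frequent itemset mining, and the rule confidence derived from it, can be sketched as follows. The seasonal records and item names are hypothetical, and `frequent_itemsets` and `confidence` are illustrative helpers rather than the implementations used in the cited studies.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search for itemsets with support >= min_support."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    result = {}
    size = 1
    while current:
        for itemset in current:
            result[itemset] = support(itemset)
        size += 1
        # Candidate generation: join frequent sets, keep those of the next size
        # whose every subset of the previous size is frequent (Apriori pruning).
        prev = set(current)
        candidates = {a | b for a in current for b in current if len(a | b) == size}
        current = [
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, size - 1))
            and support(c) >= min_support
        ]
    return result


def confidence(result, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from stored supports."""
    a = frozenset(antecedent)
    return result[a | frozenset(consequent)] / result[a]


# Hypothetical seasonal records combining rainfall, humidity and yield classes.
records = [
    {"high_rain", "high_humidity", "good_yield"},
    {"high_rain", "good_yield"},
    {"low_rain", "poor_yield"},
    {"high_rain", "high_humidity", "good_yield"},
    {"low_rain", "high_humidity", "poor_yield"},
]
sets = frequent_itemsets(records, min_support=0.4)
print(confidence(sets, {"high_rain"}, {"good_yield"}))
```

The pruning step is what keeps the search tractable as data volume grows: an itemset is only counted if all of its smaller subsets were already found to be frequent.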

Visualization methods seek to explore the interconnections between data by simplifying multivariate data. The self-organizing map neural network (SOMN) method has been used to analyse anomalous atmospheric circulation patterns in China with regard to surface temperature anomalies between 1979 and 2017 ( Gao et al., 2019 ). This method is widely used for mapping changes, e.g., regarding urban flood hazards ( Rahmati et al., 2019 ). A study on the city of Amol in Iran was conducted and, according to the aforementioned model of urban flood hazard mapping, 23% of the land area of the city is expected to face high or very high levels of flood risk, which demands efficient flood risk management. SOMN and the grid cells method were applied to determine changes in spatio-temporal land cover in Inner Mongolia between 2004 and 2014 ( Li et al., 2018 ). The Principal Component Analysis (PCA) technique has been used to assess the vulnerability of the coastal region of Bangladesh while taking into consideration the IPCC framework. The study used 31 indicators (24 socio-economic, 7 natural). PCA was applied and determined seven eigenvectors [Demographic Vulnerability (PC1), Economic Vulnerability (PC2), Agricultural Vulnerability (PC3), Water Vulnerability (PC4), Health Vulnerability (PC5), Climate Vulnerability (PC6), and Infrastructural Vulnerability (PC7)] that take into consideration climate change scenarios from 2013 to 2050 ( Uddin et al., 2019 ). PCA has also been used to build the composite drought vulnerability index ( Balaganesh et al., 2020 ).

4. Systematic Review of Climate Change-Related Analyses

4.1. Overview of Big Data-Based Climate Change Analysis

The significance of Big Data in climate-related studies is greatly recognized and its techniques are widely used to observe and monitor changes on a global scale. It facilitates understanding and forecasting to support adaptive decision-making as well as optimize models and structures ( Hassani et al., 2019 ).

Review articles can provide a better organized structure of previous studies, so the major focus areas are determined from previous review articles concerning the connection between climate change and Big Data. The major objective is to reveal how diverse disciplines appear in the related research, thereby narrowing down when and how Big Data applications and their relation to data science have appeared in climate studies.

A comprehensive overview was conducted based on the Scopus database. Fifty-seven articles were retrieved from the following search: [TITLE-ABS-KEY(“climate change”) AND TITLE-ABS-KEY(“Big Data”)] AND [TITLE-ABS-KEY(“overview”) OR TITLE-ABS-KEY(“review”)].

Articles were reviewed and selected individually for the final sample. Table 2 shows the number of articles selected and excluded.

Table 2 . Selection of articles related to the review of climate data.

The 47 articles of the final sample are shown in Tables 3 – 5 , where a straightforward description and focus area of the research are indicated as well as categorized accordingly. It is notable that mostly specific climate issues are observed (e.g., decarbonization of energy or land ecosystem) and their potential with regard to Big Data determined. The two most affected categories are agriculture and studies of sustainable cities and communities. This is a good illustration of how intertwined research on climate action is with sustainable development goals.

Table 3 . Overview of articles analysing Big Data usage with climate change issues categorized into the domains of Agriculture, Cleaner production, and Climate resilience.

Table 4 . Overview of articles analysing Big Data usage in terms of climate change issues categorized into the domains of Cyberinfrastructure (IoT), Impact assessment and Methods.

Table 5 . Overview of articles analysing Big Data usage in terms of climate change issues categorized into the domains of Sustainable cities and communities, Water, and Biodiversity.

The quality and safety of agricultural products can be assured through solutions provided by the Internet of Things (IoT) and cloud computing ( Marcu et al., 2019 ). Remote sensing and Artificial Intelligence technologies make it possible to integrate Big Data into predictive and prescriptive management tools to improve, e.g., the resilience of agricultural systems ( Jung et al., 2020 ). Big Data virtualization in the field of agriculture enables physical objects to be virtualized, e.g., sensors and devices used for defining soil moisture, water flows, or salinity, where these objects can provide diverse meaningful information in each phase of a data chain to support decision-making and information handling ( Mathivanan and Jayagopal, 2019 ). Furthermore, Big Data techniques are utilized in plant breeding ( Taranto et al., 2018 ), crop ideotypes for food security ( Christensen et al., 2018 ), and precision agriculture frameworks ( Demestichas et al., 2020 ). The Climate Smart Agriculture framework aims to enhance the capacity of agricultural systems to support food security and to integrate adaptation and mitigation into sustainable agricultural development through the latest technologies such as IoT, AI, geo-informatics, and Big Data analytics ( Gulzar et al., 2020 ). The interdisciplinary and systematic approach of soil use and management to achieve related sustainability goals has also been explored ( Hou et al., 2020 ).

Alignment with regard to the focus area of sustainable cities and communities with the 11th sustainable development goal (Sustainable cities and communities) has been explored through reviews. Big Data management can enhance the opportunity for organizations to respond to the risk of climate change in time ( Seles et al., 2018 ) as well as offers possibilities to consider sustainable production and lower emission rates. Furthermore, machine learning can be effectively utilized for low-carbon urban planning ( Milojevic-Dupont et al., 2020 ). Outside the field of industry, co-operation, legislation, and environmental agreements are essential to realize a sustainable manufacturing environment ( Hämäläinen and Inkinen, 2019 ). The concept of smart cities seeks to overcome and prevent climate change and issues concerning urbanization ( Sharifi, 2019 ), moreover, smart transportation policies can utilize the advantages of Big Data ( De Gennaro et al., 2016 ). In this smart environment, civil engineers are seen as future risk and uncertainty managers to improve community resilience through smart infrastructure programs ( Berglund et al., 2020 ).

Climate resilience studies assess how to prepare for, recover from and adapt to climate-related risks ( Center for Climate and Energy Solutions, 2019 ). Big Data seeks to support these activities by providing data of large volume, variety, and quality to reveal patterns and enable data democratization ( Faghmous et al., 2014 ). Therefore, the Big Data approach can serve as a source of key information for decision-makers in terms of creating and adapting appropriate strategies, determining current and upcoming issues, as well as identifying stages of recovery for taking actions in time ( Sarker et al., 2020 ). News media can serve as a source of near-real-time geolocated information, which can support the understanding of social movements and early-warning systems. “Combining news media with social and biophysical data is important to verify results and limit biases in analysis” ( Buckingham et al., 2020 ). One of the issues concerning urban environments is energy efficiency and carbon emissions, for which net zero energy movements seek to bring about a solution as well as the application of a resilience ecological framework for net zero energy research ( Hu and Pavao-Zuckerman, 2019 ). Furthermore, Big Data techniques with regard to machine learning enable the attitude of people toward and recognition of environmental changes to be determined ( Park et al., 2020 ). Big Data and machine learning approaches are vital in comprehensively merging heterogeneous genomic and ecological datasets ( Cortés et al., 2020 ).

However, although review articles have explored the potential of Big Data techniques in diverse areas, comprehensive overviews of climate change are becoming less of a focus. Even though data-intensive research applications may seem to be unbalanced among disciplines (Hassani et al., 2019), the dynamism and complexity of climate issues must not be neglected. This complexity calls for an interdisciplinary approach and the intertwining of diverse disciplines, to which the System of Systems concept (climate computing) is the urgent answer.

4.2. Meta-Analysis With Regard to the Methods of Climate-Related Analyses

Co-word analysis examines the relationships between keywords to reveal the structure and development of methodologies or applications. The relationships between keywords in research papers “contains valuable information about knowledge structure of the field, its relevant concepts, and their connections” (Lozano et al., 2019). It is our aim to identify the diverse focus areas, methodologies, and techniques of Big Data-driven climate change analyses and to harmonize them so that the field-specific results achieved can be better utilized.

The Scopus database was used to identify the corresponding papers using the following search: [TITLE-ABS-KEY(“climate change”) AND TITLE-ABS-KEY(“Big Data”)]. As a result, 442 articles were retrieved and the co-occurrence of their keywords was analyzed using VOSviewer. The papers were published between 2012 and 2020. In Figure 3, seven clusters are indicated by different colors that span topics related to climate change and the application methods of Big Data.

Figure 3. The network of keyword co-occurrence in climate-related Big Data articles.
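The co-word counting behind such a keyword network can be illustrated with a minimal sketch. The keyword lists below are hypothetical and are not taken from the Scopus export, and the function names `cooccurrence` and `association_strength` are our own. The normalisation shown is a simplified association-strength ratio; the exact normalisation and clustering performed by VOSviewer are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(papers):
    """Count keyword occurrences and unordered keyword-pair co-occurrences."""
    occurrences = Counter()
    links = Counter()
    for keywords in papers:
        # Normalise case and drop duplicates within a single paper.
        unique = sorted({k.strip().lower() for k in keywords})
        occurrences.update(unique)
        links.update(combinations(unique, 2))
    return occurrences, links


def association_strength(occurrences, links):
    """Divide each link weight by the product of the two keyword frequencies."""
    return {
        pair: count / (occurrences[pair[0]] * occurrences[pair[1]])
        for pair, count in links.items()
    }


# Hypothetical author keywords, one list per paper (illustration only).
papers = [
    ["climate change", "big data", "machine learning"],
    ["climate change", "big data", "remote sensing"],
    ["Climate Change", "machine learning"],
]

occurrences, links = cooccurrence(papers)
strengths = association_strength(occurrences, links)

for pair, weight in sorted(links.items()):
    print(pair, weight, round(strengths[pair], 3))
```

Pairs with a high normalised weight are the ones a clustering step would tend to place in the same cluster.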

Each cluster refers to a focus area including its attributes of interrelationships as well as methodologies and techniques applied in the field.

The “red” cluster denotes the connections between Big Data technologies and the methods applied to optimize procedures, measure the impact of climate change and resilience, and make predictions. The technologies considered include artificial intelligence, learning algorithms such as machine learning and deep learning, data analytics, neural networks, and cluster computing. Neural networks are used to analyse climate change, weather prediction, and visualization (Buszta and Mazurkiewicz, 2015), while machine learning techniques are used for intelligent recognition (Demertzis and Iliadis, 2016) and to define the impact of climate change and resilience (Rolnick et al., 2019). In addition, they are used to predict epidemics and diseases in both social (Rees et al., 2019) and environmental contexts, e.g., in the case of crops (Fenu and Malloci, 2019), coffee diseases and pests (Lasso and Corrales, 2017), or pedotransfer functions (Benke et al., 2020). Clustering techniques on cloud computing infrastructure have been applied, e.g., to map changes in glaciers (Ayma et al., 2019). A novel machine learning approach has been developed by the U.S. Department of Energy's National Renewable Energy Laboratory using adversarial training in climate forecasting, in which the model provides a “physics-informed variation to the super resolution generative adversarial network (SRGAN) model, which extends proven performance on super resolution of natural images to scientific datasets” (Stengel et al., 2019). This breakthrough can save computational time and data storage and can provide more accessible high-resolution climate data that can be utilized in a wide range of climate scenarios. These techniques seek to assess risk management in terms of human and environmental health by providing vital information concerning the present conditions and making predictions about the future.

Keywords included in the “orange” cluster mainly describe agriculture-related climate issues and adaptations. IoT technologies, information systems, and sensor networks tend to be applied in this field. Big Data increases the heterogeneity “across farms, farmers, climates, crops, soils, natural resources, models, management strategies and outcomes, post production value chain system, and other economic variables of interest”, which can boost knowledge with regard to the concept of climate-smart agriculture (Rao, 2018). IoT technologies have proven beneficial in improving efficiency in the complex field of agriculture. Sensors are used to collect vital information about the soil, fertilizer, moisture, sunshine, temperature, and geography of farmland for monitoring, and to link to other databases for identifying attributes (Yan-e, 2011). The combination of automation and IoT technologies opens broad perspectives in smart agriculture, such as remote-controlled robots that perform tasks, smart and intelligent decision making based on real-time data, and warehouse management (Gondchawar and Kawitkar, 2016).

The “purple” cluster represents natural disasters caused by climate change, e.g., floods or deteriorating air quality, and the related risk management. Decision-making processes are supported by data mining techniques and by statistical as well as spatial analysis. The frequency of natural disasters in the Philippines increased by 147% from 1980 to 2012 and continues to rise (Garcia and Hernandez, 2017). Through data mining, Big Data plays a significant role in creating real-time feedback loops on natural disasters to support disaster management in prevention, protection, and mitigation processes as well as in response and recovery, and in increasing the resilience of citizens (Yang et al., 2017).

The “light blue” cluster groups climate models that define the interactions of the drivers of climate change. Topics like ecology, biodiversity, vulnerability, and the issue of water resources are included. Big Data-based techniques are widely used and the importance of open data must be recognized. Cloud computing and uncertainty analysis tend to support the modeling of life cycles and climatic effects. The open data science approach ensures a transparent and collaborative environment for multi-model climate change data analytics (Fiore et al., 2018). Information about the geographic distribution of greenhouse gas emissions can be useful in terms of high-resolution modeling (Charkovska et al., 2019).

The “green” cluster defines topics with regard to sustainable development, dealing with gas emissions, greenhouse gases, energy efficiency, and environmental policies. Information analytics and environmental technologies as well as green computing seek to minimize hazardous waste while maximizing energy efficiency and recyclability to foster the concept of a circular economy. Data mining, generic algorithms, and neural networks are gradually being applied in sustainable consumption research, which enables more accurate and better visualized results (Wang et al., 2019). Managing efficient energy use is a commonly discussed issue that covers climate change impact analysis with regard to the energy use of campus buildings (Fathi and Srinivasan, 2019), the life-cycle assessment of energy-consuming products (Ross and Cheah, 2019), and the adoption of green computing to reduce the carbon footprint of ICT (Airehrour et al., 2019).

The “blue” cluster seems to reveal methodologies considered in climatology, urbanization, and adaptive management. Remote sensing and satellite imagery make it possible to collect a large amount of data that supports mapping and is used to make further predictions. Satellite remote sensing quantifies processes and spatio-temporal states of the atmosphere, land, and oceans (Yang et al., 2013) and enables, for example, climate change and the impact of human activities on cropland productivity to be detected (Yan et al., 2020) and changes in water resources to be mapped (Senay et al., 2017). The monitoring of carbon by satellite observation provides information about greenhouse gases and emissions that can be utilized in estimation processes regarding the investigation of CO2 (Zhao et al., 2019).

The “yellow” cluster consists of global climate change-related data analyses, visualization methods, regression analysis, and time series analysis. Open systems and open sources are gaining ever more attention in this field. A web-based visualization of complex climate data allows scientists, resource managers, policymakers, and the public to explore climate-balance projections even at the local level (Alder and Hostetler, 2015). Assessing spatiotemporal data to gain knowledge from it is a complex challenge; however, a well-developed visual analytical system can support performance improvement methods and techniques (Li et al., 2013). A high performance query analytical framework that proposes grid transformation can support complex climate data observation and model simulation ( L et al., 2017 ). For climate environmental analyses, 3D visualization simulation of cloud data is gaining attention in the fields of computer graphics and meteorology (Xie Y. et al., 2019).

Contemporary technologies like Big Data analytics and IoT-based models are applied to build a knowledge base in any field by collecting and analysing large, complex, heterogeneous data sets. This encourages evidence-based policy making and serves as a decision support tool for risk assessment and resilience adaptation, while aiding the forecasting of future socio-economic as well as environmental conditions caused by climate-related change. Big Data research is important in itself and contributes to the understanding of climate change, but managing its results in an integrated way increases the level of problem extraction and provides new solutions for decision makers.

4.3. The Role of Social Sciences in Climate Change Studies

Most articles on climate change belong to the field of environmental science, closely followed by Earth and planetary sciences, then agricultural and biological sciences. Interestingly, more articles have been published in the social sciences than in the fields of engineering and energy.

The growing amount of information and knowledge renders multidisciplinary analyses covering the whole field of science and the development of such analytical tools indispensable as the knowledge accumulated cannot be directly utilized without systematization and targeted processing.

Climate change issues tend to connect different disciplines as well as research ideas, models, and solutions related to these issues. In the following, the significant connection between climate and the social sciences is discussed. The Scopus database was used to extract relevant information for meta-analysis.

The search for a connection with social sciences yielded 1,203 documents: [TITLE-ABS-KEY(“climate change”) AND TITLE-ABS-KEY(“social sciences”)]. The network of keyword co-occurrences describing the interrelationship between climate change and the social sciences is shown in Figure 4.

Figure 4. The relation between social sciences and climate change.

Based on the intersections presented in Figure 4 , seven communities are detected. The red community includes emissions, energy and economic hubs. The yellow community includes habitat-related nodes. The light blue community covers regulators and issues concerning water management, while the purple community summarizes concepts related to “change,” e.g., vulnerability, adaptation, etc. The green community includes interdisciplinary subject areas, while the dark blue one represents political keywords and the orange community describes sustainable mergers.

A complex relationship exists between human and natural processes involving social, political, geographic, and cultural contexts that demands a multidisciplinary concept ( Fiske et al., 2018 ). Environmental changes call for socio-economic transformation to mitigate the effects caused by humans and increase resilience. Changes are observed in a diverse range of areas such as agriculture and food security, air quality, waters, energy consumption, land ecosystem as well as global warming. These issues must be managed through strategic planning and management with a high degree of focus on long-term sustainable operation. Socio-ecological-economic models must integrate social and biophysical information in order to develop sufficient mitigation and adaptation strategies ( Sullivan and Huntingford, 2009 ). The impact of climate change on water resources is critical as it is related to floods, droughts, tidal waves, and humidity. Big Data-based processes are used to determine, for example, soil conditions and humidity ( Anton et al., 2019 ) to estimate energy consumption ( Seyedzadeh et al., 2018 ) or greenhouse gas emissions ( Hamrani et al., 2020 ) that enable optimal processes and interventions to be predicted. Decision support algorithms, models, and databases are used to provide evidence-base for policymaking and legislation ( Aragona and De Rosa, 2019 ) as well as disaster management ( Akter and Wamba, 2019 ). These can be considered at organizational ( Kouloukoui et al., 2019 ), local ( Giest, 2017 ), sub-national ( Hsu et al., 2019 ), national ( Iacobuta et al., 2018 ), or even global levels ( Flato et al., 2014 ).

Socio-environmental sciences seek to explore the systematic cause-effect relationships behind the environmental impact of human-induced climate change. By providing heterogeneous data and supportive models, positive changes can be achieved through interdisciplinary data-driven perceptions that contribute to a better understanding of the complex issue, monitor changes, support decision-making, and bring about in-time interventions.

4.4. The Importance of the System of Systems Approach

Climate change is one of the most significant global challenges that needs to be managed. To resolve any of the climate change-related challenges, “it is essential to elicit and integrate knowledge across a range of systems, informing the design of solutions that take into account the complex and uncertain nature of the individual systems and their interrelationships” (Little et al., 2019). The system of systems (SoS) framework enables the analysis of the interdependencies between various systems (e.g., human, information, environmental, and physical systems) and therefore provides a clear understanding of the complex nature of the issue (Fan and Mostafavi, 2019). The trends in data science and information technology (Tannahill and Jamshidi, 2014) support the integration of various disciplines and research outcomes to represent a socio-environmental system holistically and to inform policy and decision-making processes (Iwanaga et al., 2020), which can be referred to as climate computing.

To highlight the importance of applying the system of systems approach, the latest Big Data-based works in the field of climate change were reviewed, based on which we identified an SoS framework (Figure 5). In the network of applications, the nodes show the different studies, and the edges represent the relationships between their results. The Big Data applications have been grouped according to the sustainable development goals, thus showing their possible scientific contributions to other fields.

Figure 5. The system of systems concept of Big Data applications.

By processing satellite data, the system developed by Semlali and El Amrani (2021) can monitor changes in air quality, which can also be used to monitor agricultural areas (Majidi et al., 2021). Cloud tracking (He et al., 2020) further helps to assess the evolution of air pollution, the reliability of which can be further enhanced with statistical downscaling solutions (Wang Q. et al., 2020). The time-series data (Joshi et al., 2019) extracted from satellite images support long-term forecasts, but the description of cloud motion (Xie Y. et al., 2019) can also be used to refine shorter-term analyses. The use of satellite imagery as a data source in urban planning also helps identify climate-friendly solutions (Milojevic-Dupont et al., 2020).

Web-based water management ( Mourtzios et al., 2021 ) can be supported with trends identified from time-series data ( Ise et al., 2020 ), but remotely sensed water flow data also complements the agricultural water management model ( Ismail et al., 2020 ). And if we increase the resolution of the data ( Jimenez et al., 2019 ), we can also understand the causal relationships related to consumption. In terms of infrastructure load, patterns of population movement ( Gurram et al., 2019 ) offer exciting opportunities, but can also be integrated with the condition of buildings ( Gouveia and Palma, 2019 ), which also supports the satisfaction of urban planning tasks ( Milojevic-Dupont et al., 2020 ) at a higher level.

Agricultural satellite imagery applications (Majidi et al., 2021) can be transferred to air quality satellite monitoring (Semlali and El Amrani, 2021), or time-series data (Ise et al., 2020) can be used to plan better agricultural interventions. By implication, satellite-based support plays an important role in modeling agricultural water management (Ismail et al., 2020), but disaster news (Park et al., 2020) also helps provide a deeper understanding of social involvement. In assessing disaster resilience in different areas (Sasaki et al., 2020), satellite imagery provides feedback on risks that can even be revealed over time (Joshi et al., 2019). Satellite-based results can be supported by on-site special (Lambrinos, 2019) and meteorological (Mabrouki et al., 2021) sensor data, and flood protection of valuable agricultural areas can also be planned with flood models (Avand et al., 2021).

Identifying patterns in time-series data (Ise et al., 2020) helps with research in many other areas, whether it is agricultural water management (Ismail et al., 2020) or marine habitat protection (Coro et al., 2020). It allows forecasting and a better understanding of coastal traffic (Kubo et al., 2020) and increases the reliability of disaster resilience estimation (Sasaki et al., 2020). By extracting time series data (Joshi et al., 2019) from satellite imagery, we can indirectly validate the models by comparing the time series or identify the factors of potato disease (Fenu and Malloci, 2019). In urban developments (Milojevic-Dupont et al., 2020) and in building condition surveys (Gouveia and Palma, 2019), the forecast shows the development of infrastructure expansion and maintenance, to which the probability of flood protection problems (Avand et al., 2021) can also be linked.

Statistical downscaling (Wang Q. et al., 2020) helps to find the external variables of the consumption patterns that Mourtzios et al. (2021) identified based on remote sensing, and it is comparable with the results of satellite image-based analyses (Semlali and El Amrani, 2021). It is also comparable to other approaches (Jimenez et al., 2019), which strengthens confidence in the models (Qin and Chi, 2020). Better resolution data supports marine habitat protection planning (Coro et al., 2020) and risk assessment input (Fenu and Malloci, 2019), but can also be used to analyze building consumption data (Gouveia and Palma, 2019). The efficiency of downscaling techniques can be increased with the Internet of Things toolkit (Lambrinos, 2019). The increase in the number of observations allows a more accurate description of local climatic conditions to estimate floods (Avand et al., 2021) and heat island effects, as well as other aspects of sustainable urban planning (Milojevic-Dupont et al., 2020).

Coastal tourism monitoring ( Kubo et al., 2020 ) can be integrated with traffic data ( Hu et al., 2020 ) to optimize traffic management and thereby reduce pollutant emissions. The effect of transport on plant damage can be included ( Meineke et al., 2020 ) as a factor to be analyzed, or we can use it ( Gurram et al., 2019 ) to identify patterns in population movement.

Population movements ( Gurram et al., 2019 ) affect water consumption ( Mourtzios et al., 2021 ), can damage plants ( Meineke et al., 2020 ), show the popularity of coastal areas ( Kubo et al., 2020 ), but are also suitable for improving transport planning ( Hu et al., 2020 ). Because the movement of residents is closely related to the infrastructure ( Milojevic-Dupont et al., 2020 ), it is a very valuable input in urban planning.

The data of the Internet of Things sensors ( Mabrouki et al., 2021 ) allow the conclusions drawn from the satellite images to be verified ( Majidi et al., 2021 ), as a measuring station ( Jimenez et al., 2019 ) increases the number of observations, thus better downscaling solutions ( Wang Q. et al., 2020 ) can be made. It can be used for causal exploration of plant morphological damage ( Fenu and Malloci, 2019 ) and supports agricultural irrigation water demand planning ( Ismail et al., 2020 ), but can also be imported into flood models ( Avand et al., 2021 ).

In the Big Data application that supports the energy demand management of buildings (Gouveia and Palma, 2019), water consumption data (Mourtzios et al., 2021) can be used as an extension. Development alternatives can be ranked based on time series data (Ise et al., 2020) or on time series extracted from satellite images (Joshi et al., 2019). This ranking can be supported by a deeper understanding of energy demand gained from downscaled data (Wang Q. et al., 2020), because the resolution of the input data can be improved (Jimenez et al., 2019).

Based on the presented system of systems framework, it can be seen how the new results of Big Data applications related to climate change contribute to other areas. Remote sensing of water consumption (Mourtzios et al., 2021), analysis of cloud water content (He et al., 2020), and the agricultural water management model (Ismail et al., 2020) contribute to the goal of clean water and sanitation (SDG6). Planning based on the analysis of traffic data (Hu et al., 2020), studying population movements (Gurram et al., 2019), and flooding models (Avand et al., 2021) support the goal of industry, innovation and infrastructure (SDG9). Climate-friendly urban planning (Milojevic-Dupont et al., 2020), monitoring the energy demand of buildings (Gouveia and Palma, 2019), and defining disaster resilience (Sasaki et al., 2020) play an important role in achieving sustainable cities and communities (SDG11). The Climate Action goal (SDG13) tackles most data gaps, so research such as linking satellite images with air quality (Semlali and El Amrani, 2021), preprocessing them (Meraner et al., 2020; Qin and Chi, 2020; Semlali et al., 2020), the analysis of time series data (Ise et al., 2020) and its exploration (Joshi et al., 2019), downscaling techniques (Wang Q. et al., 2020), enrichment of precipitation and temperature data (Jimenez et al., 2019), tracking the movement of clouds (Xie Y. et al., 2019), or just using IoT sensors (Mabrouki et al., 2021) are all key in creating a strategy to support the achievement of the climate goal. For the sustainability of life below water (SDG14), marine life prediction models (Coro et al., 2020) and human coastal activity (Kubo et al., 2020) can be integrated. Of course, the goal of life on land (SDG15) also requires new research, where a satellite-based study of agriculture and forestry (Majidi et al., 2021), deployment of IoT sensors (Lambrinos, 2019), analysis of climatic factors of potato damage (Fenu and Malloci, 2019), studying the morphology of plants (Meineke et al., 2020), or social media based illustration of palm oil consumption (Teng et al., 2020) are promising. Partnerships for the goals (SDG17) is critical in several ways: on the one hand, we recommend the grouping of climate services (Howard et al., 2020), which fits the SoS concept we propose, and on the other hand, we need to integrate the knowledge and give feedback to society. An exciting tool for measuring the effectiveness of climate and sustainability related measures is the analysis of news comments (Park et al., 2020).

It is essential to highlight that Big Data research on climate change can be used in other areas, as shown by the SDG grouping in Figure 5. Thus, based on the recommended SoS viewpoint, the specific results of sustainability-related research and development projects can be integrated, enhancing knowledge accumulation and utilization.
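The network view of the SoS framework, with studies as nodes, relationships as edges, and SDGs as groups, can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a small subset of the studies, takes the SDG assignment from the grouping described above, and reads the edges off the population-movement and coastal-tourism paragraphs. The helper names `sdg_links` and `degree` are our own, and the full network of Figure 5 is not reproduced.

```python
from collections import Counter

# SDG assignment of a subset of the reviewed studies, following the text.
sdg_of = {
    "Mourtzios et al., 2021": "SDG6",
    "Hu et al., 2020": "SDG9",
    "Gurram et al., 2019": "SDG9",
    "Milojevic-Dupont et al., 2020": "SDG11",
    "Kubo et al., 2020": "SDG14",
    "Meineke et al., 2020": "SDG15",
}

# Undirected relationships mentioned in the text (illustrative subset).
edges = [
    ("Gurram et al., 2019", "Mourtzios et al., 2021"),
    ("Gurram et al., 2019", "Meineke et al., 2020"),
    ("Gurram et al., 2019", "Kubo et al., 2020"),
    ("Gurram et al., 2019", "Hu et al., 2020"),
    ("Gurram et al., 2019", "Milojevic-Dupont et al., 2020"),
    ("Kubo et al., 2020", "Hu et al., 2020"),
]


def sdg_links(edges, sdg_of):
    """Count edges per unordered pair of SDG groups."""
    links = Counter()
    for a, b in edges:
        links[tuple(sorted((sdg_of[a], sdg_of[b])))] += 1
    return links


def degree(edges):
    """Count how many relationships each study takes part in."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


links = sdg_links(edges, sdg_of)
deg = degree(edges)
cross_sdg = sum(n for (a, b), n in links.items() if a != b)

print("links between SDG groups:", dict(links))
print("most connected study:", deg.most_common(1)[0])
print("edges crossing SDG boundaries:", cross_sdg)
```

Edges that cross SDG boundaries are the ones that indicate where a result from one goal area may be reused in another.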

5. Discussion

This paper described the essential need for research and development objectives to realize and manage the complex issues of climate change through Big Data tools. Data-driven applications were reviewed through the co-occurrence analysis of keywords, which showed the widespread application of Big Data technologies and tools; however, comprehensive and integrative analyses are less prevalent.

This research aimed to highlight the perspective of systems of systems (SoS), as the drivers and effects of climate change, as well as resilience and adaptation to them, cannot be determined without exploring the synergies between new research trends and disciplines. Based on the recommended SoS viewpoint, the specific results of sustainability-related research and development projects can be integrated, enhancing knowledge accumulation and utilization. The tools of data and systems sciences can play a crucial role in the recognition of climate challenges and mitigation opportunities thanks to the integration of heterogeneous data and models, and the exploration of the relationship between environmental and social factors. This integrated thinking lays the groundwork for promising future trends in climate computing.

It can be claimed that the exclusive analysis of climatic factors cannot bring about sufficient strategic adaptation by itself; rather, socio-environmental factors must be integrated into the climate change models.

Mitigating the impacts of climate change and adapting successfully require effective strategic planning by countries worldwide, whose decision-making requires complex models and sources of information. The Big Data toolkit enables the systematization, processing, and evaluation of heterogeneous data and information sources, which is unfeasible with traditional disciplinary analysis tools. The harmonization of the ever-expanding scientific knowledge and diversified data sources related to climate change may be one of the most urgent tasks for researchers in the future. This research presented Big Data analytics tools and their contribution toward exploring the characteristics of climate change as well as climate action-related counterparts such as sustainability and social sciences that are essential for the successful development and implementation of strategies.

Author Contributions

VS: conceptualization, validation, investigation, writing-original draft, visualization. JA: conceptualization, validation, resources, writing-review and editing, supervision, and funding acquisition. TC: writing-original draft, investigation, visualization, and validation. All authors contributed to the article and approved the submitted version.

Funding

This research was funded by the National Laboratory for Climate Change (NKFIH-872 project). We acknowledge the financial support of Széchenyi 2020 under the GINOP-2.3.2-15-2016-00016.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Aggarwal, C. (2015). “Data classification,” in Data Mining. Cham: Springer. doi: 10.1007/978-3-319-14142-8_10

Airehrour, D., Cherrington, M., Madanian, S., and Singh, J. (2019). “Reducing ICT carbon footprints through adoption of green computing,” in Proceedings of the IE 2019 International Conference (IE 2019), ed. F. G. Filip (Bucharest), 257–263.

Akter, S., and Wamba, S. F. (2019). Big data and disaster management: a systematic review and agenda for future research. Ann. Oper. Res. 283, 939–959. doi: 10.1007/s10479-017-2584-2

Alder, J. R., and Hostetler, S. W. (2015). Web based visualization of large climate data sets. Environ. Model. Softw. 68, 175–180. doi: 10.1016/j.envsoft.2015.02.016

Allen, J. L., McMullin, R. T., Tripp, E. A., and Lendemer, J. C. (2019). Lichen conservation in North America: a review of current practices and research in Canada and the United States. Biodivers. Conserv. 28, 3103–3138. doi: 10.1007/s10531-019-01827-3

Al-Shiakhli, S. (2019). Big Data Analytics: A Literature Review Perspective. Digitala Vetenskapliga Arkivet. Available online at: http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-74173

Anton, C. A., Matei, O., and Avram, A. (2019). “Collaborative data mining in agriculture for prediction of soil moisture and temperature,” in Computer Science On-Line Conference (Cham: Springer), 141–151. doi: 10.1007/978-3-030-19807-7_15

Aragona, B., and De Rosa, R. (2019). Big data in policy making. Math. Popul. Stud. 26, 107–113. doi: 10.1080/08898480.2017.1418113

Ardabili, S., Mosavi, A., Dehghani, M., and Várkonyi-Kóczy, A. R. (2019). “Deep learning and machine learning in hydrological processes climate change and earth systems a systematic review,” in International Conference on Global Research and Education (Cham: Springer), 52–62. doi: 10.1007/978-3-030-36841-8_5

Aslan, Z., Erdemir, G., Feoli, E., Giorgi, F., and Okcu, D. (2019). Effects of climate change on soil erosion risk assessed by clustering and artificial neural network. Pure Appl. Geophys. 176, 937–949. doi: 10.1007/s00024-018-2010-y

Avand, M., Moradi, H. R., and Ramazanzadeh Lasboyee, M. (2021). Spatial prediction of future flood risk: an approach to the effects of climate change. Geosciences 11:25. doi: 10.3390/geosciences11010025

Ayma, V., Beltrán, C., Happ, P., Costa, G., and Feitosa, R. (2019). Mapping glacier changes using clustering techniques on cloud computing infrastructure. Int. Arch. Photogr. Remote Sens. Spat. Inform. Sci. 29–34. doi: 10.5194/isprs-archives-XLII-2-W16-29-2019

Balaganesh, G., Malhotra, R., Sendhil, R., Sirohi, S., Maiti, S., Ponnusamy, K., et al. (2020). Development of composite vulnerability index and district level mapping of climate change induced drought in Tamil Nadu, India. Ecol. Indic. 113:106197. doi: 10.1016/j.ecolind.2020.106197

Balaji, V. (2015). Climate computing: the state of play. Comput. Sci. Eng. 17, 9–13. doi: 10.1109/MCSE.2015.109

Benjelloun, F.-Z., Lahcen, A. A., and Belfkih, S. (2015). “An overview of big data opportunities, applications and tools,” in 2015 Intelligent Systems and Computer Vision (ISCV) (Fez), 1–6. doi: 10.1109/ISACV.2015.7105553

Benke, K., Norng, S., Robinson, N., Chia, K., Rees, D., and Hopley, J. (2020). Development of pedotransfer functions by machine learning for prediction of soil electrical conductivity and organic carbon content. Geoderma 366:114210. doi: 10.1016/j.geoderma.2020.114210

Berglund, E. Z., Monroe, J. G., Ahmed, I., Noghabaei, M., Do, J., Pesantez, J. E., et al. (2020). Smart infrastructure: a vision for the role of the civil engineering profession in smart cities. J. Infrastruct. Syst . 26:03120001. doi: 10.1061/(ASCE)IS.1943-555X.0000549

Bertot, J. C., Gorham, U., Jaeger, P. T., Sarin, L. C., and Choi, H. (2014). Big data, open government and e-government: issues, policies and recommendations. Inform. Pol. 19, 5–16. doi: 10.3233/IP-140328

Bibri, S. E. (2018). The iot for smart sustainable cities of the future: an analytical framework for sensor-based big data applications for environmental sustainability. Sustain. Cities Soc. 38, 230–253. doi: 10.1016/j.scs.2017.12.034

Buckingham, K., Brandt, J., Anderson, W., and Singh, R. (2020). The untapped potential of mining news media events for understanding environmental change. Curr. Opin. Environ. Sustain. 45, 92–99. doi: 10.1016/j.cosust.2020.08.015

Buszta, A., and Mazurkiewicz, J. (2015). “Climate changes prediction system based on weather big data visualisation,” in International Conference on Dependability and Complex Systems (Cham: Springer), 75–86. doi: 10.1007/978-3-319-19216-1_8

Cannon, A. J. (2015). Selecting gcm scenarios that span the range of changes in a multimodel ensemble: application to cmip5 climate extremes indices. J. Clim . 28, 1260–1267. doi: 10.1175/JCLI-D-14-00636.1

Carvalho, M. P., Melo-Gonçalves Teixeira, J., and Rocha, A. (2016). Regionalization of Europe based on a k-means cluster analysis of the climate change of temperatures and precipitation. Phys. Chem. Earth Parts A/B/C 94, 22–28. doi: 10.1016/j.pce.2016.05.001

Center for Climate and Energy Solutions (2019). What Is Climate Resilience and Why Does It Matter? Center for Climate and Energy Solutions.

Challinor, A. J., Adger, W. N., Benton, T. G., Conway, D., Joshi, M., and Frame, D. (2018). Transmission of climate risks across sectors and borders. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci . 376:20170301. doi: 10.1098/rsta.2017.0301

Chapman, J., Power, A., Netzel, M. E., Sultanbawa, Y., Smyth, H. E., Truong, V. K., et al. (2020). Challenges and opportunities of the fourth revolution: a brief insight into the future of food. Crit. Rev. Food Sci. Nutr . 1–9. doi: 10.1080/10408398.2020.1863328

Charkovska, N., Horabik-Pyzel, J., Bun, R., Danylo, O., Nahorski, Z., Jonas, M., et al. (2019). High-resolution spatial distribution and associated uncertainties of greenhouse gas emissions from the agricultural sector. Mitigat. Adapt. Strat. Glob. Change 24, 881–905. doi: 10.1007/s11027-017-9779-3

Chee, C.-H., Jaafar, J., Aziz, I. A., Hasan, M. H., and Yeoh, W. (2019). Algorithms for frequent itemset mining: a literature review. Artif. Intell. Rev . 52, 2603–2621. doi: 10.1007/s10462-018-9629-z

Chen, H., Chiang, R. H., and Storey, V. C. (2012). Business intelligence and analytics: from big data to big impact. MIS Quart . 36, 1165–1188. doi: 10.2307/41703503

Christensen, A., Srinivasan, V., Hart, J. C., and Marshall-Colon, A. (2018). Use of computational modeling combined with advanced visualization to develop strategies for the design of crop ideotypes to address food security. Nutr. Rev . 76, 332–347. doi: 10.1093/nutrit/nux076

Clarke, A., and Margetts, H. (2014). Governments and citizens getting to know each other? Open, closed, and big data in public management reform. Policy Intern . 6, 393–417. doi: 10.1002/1944-2866.POI377

Climate Change (2014). Climate Change 2013: The Physical Science Basis: Working Group I Contribution to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change . Cambridge University Press.

Constantiou, I. D., and Kallinikos, J. (2015). New games, new rules: big data and the changing context of strategy. J. Inform. Technol . 30, 44–57. doi: 10.1057/jit.2014.17

Cook, J., and Lewandowsky, S. (2016). Rational irrationality: modeling climate change belief polarization using Bayesian networks. Top. Cogn. Sci . 8, 160–179. doi: 10.1111/tops.12186

Coro, G., Pagano, P., and Ellenbroek, A. (2020). Detecting patterns of climate change in long-term forecasts of marine environmental parameters. Int. J. Digital Earth 13, 567–585. doi: 10.1080/17538947.2018.1543365

Cortés, J., Restrepo-Montoya, M., and Bedoya-Canas, L. E. (2020). Modern strategies to assess and breed forest tree adaptation to changing climate. Front. Plant Sci . 11:1606. doi: 10.3389/fpls.2020.583323

Craglia, M., Hradec, J., Nativi, S., and Santoro, M. (2017). Exploring the depths of the global earth observation system of systems. Big Earth Data 1, 21–46. doi: 10.1080/20964471.2017.1401284

Creutzig, F., Lohrey, S., Bai, X., Baklanov, A., Dawson, R., Dhakal, S., et al. (2019). Upscaling urban data science for global climate solutions. Glob. Sustain . 2. doi: 10.1017/sus.2018.16

Cuzzocrea, A., Gaber, M. M., Fadda, E., and Grasso, G. M. (2019). An innovative framework for supporting big atmospheric data analytics via clustering-based spatio-temporal analysis. J. Ambient Intell. Human. Comput . 10, 3383–3398. doi: 10.1007/s12652-018-0966-1

De Gennaro, M., Paffumi, E., and Martini, G. (2016). Big data for supporting low-carbon road transport policies in Europe: applications, challenges and opportunities. Big Data Res . 6, 11–25. doi: 10.1016/j.bdr.2016.04.003

Demertzis, K., and Iliadis, L. (2016). “Adaptive elitist differential evolution extreme learning machines on big data: intelligent recognition of invasive species,” in INNS Conference on Big Data (Cham: Springer), 333–345. doi: 10.1007/978-3-319-47898-2_34

Demestichas, K., and Daskalakis, E. (2020). Data lifecycle management in precision agriculture supported by information and communication technology. Agronomy 10:1648. doi: 10.3390/agronomy10111648

Dhyani, S., Bartlett, D., Kadaverugu, R., Dasgupta, R., Pujari, P., and Verma, P. (2020). Integrated climate sensitive restoration framework for transformative changes to sustainable land restoration. Restor. Ecol . 28, 1026–1031. doi: 10.1111/rec.13230

Di Gregorio, M., Fatorelli, L., Paavola, J., Locatelli, B., Pramova, E., Nurrochmat, D. R., et al. (2019). Multi-level governance and power in climate change policy networks. Glob. Environ. Change 54, 64–77. doi: 10.1016/j.gloenvcha.2018.10.003

Dörgö, G., Sebestyén, V., and Abonyi, J. (2018). Evaluating the interconnectedness of the sustainable development goals based on the causality analysis of sustainability indicators. Sustainability 10:3766. doi: 10.3390/su10103766

Du, X., Shrestha, N. K., and Wang, J. (2019). Assessing climate change impacts on stream temperature in the athabasca river basin using swat equilibrium temperature model and its potential impacts on stream ecosystem. Sci. Tot. Environ . 650, 1872–1881. doi: 10.1016/j.scitotenv.2018.09.344

Dubey, R., Gunasekaran, A., Childe, S. J., Papadopoulos, T., Luo, Z., Wamba, S. F., et al. (2019). Can big data and predictive analytics improve social and environmental sustainability? Technol. Forecast. Soc. Change 144, 534–545. doi: 10.1016/j.techfore.2017.06.020

Erl, T., Khattak, W., and Buhler, P. (2016). Big data Fundamentals: Concepts, Drivers & Techniques . Upper Saddle River, NJ: Prentice Hall Press.

Eyring, V., Bony, S., Meehl, G. A., Senior, C. A., Stevens, B., Stouffer, R. J., et al. (2016). Overview of the coupled model intercomparison project phase 6 (CMIP6) experimental design and organization. Geosci. Model Dev . 9, 1937–1958. doi: 10.5194/gmd-9-1937-2016

Faghmous, J. H., and Kumar, V. (2014). A big data guide to understanding climate change: the case for theory-guided data science. Big Data 2, 155–163. doi: 10.1089/big.2014.0026

Fan, C., and Mostafavi, A. (2019). Metanetwork framework for performance analysis of disaster management system-of-systems. IEEE Syst. J . 14, 1265–1276. doi: 10.1109/JSYST.2019.2926375

Fathi, S., and Srinivasan, R. (2019). “Climate change impacts on campus buildings energy use: an AI-based scenario analysis,” in Proceedings of the 1st ACM International Workshop on Urban Building Energy Sensing, Controls, Big Data Analysis, and Visualization (New York, NY), 112–119. doi: 10.1145/3363459.3363540

Fayyad, U., and Simoudis, E. (1997). “Data mining and knowledge discovery. Tutorial notes at pad'97-1st int,” in Conf. Prac. App. KDD & Data Mining (London). doi: 10.1023/A:1009715820935

Fenu, G., and Malloci, F. M. (2019). “An application of machine learning technique in forecasting crop disease,” in Proceedings of the 2019 3rd International Conference on Big Data Research (New York, NY), 76–82. doi: 10.1145/3372454.3372474

Fiore, S., Elia, D., Palazzo, C. A., D'Anca Antonio, F., Williams, D. N., et al. (2018). “Towards an open (data) science analytics-hub for reproducible multi-model climate analysis at scale,” in 2018 IEEE International Conference on Big Data (Big Data) (Seattle, WA), 3226–3234. doi: 10.1109/BigData.2018.8622205

Fischer, G., Shah, M., Tubiello, F. N., and Van Velhuizen, H. (2005). Socio-economic and climate change impacts on agriculture: an integrated assessment, 1990-2080. Philos. Trans. R. Soc. B Biol. Sci . 360, 2067–2083. doi: 10.1098/rstb.2005.1744

Fiske, S., Hubacek, K., Jorgenson, A., Li, J., McGovern, T., Rick, T., et al. (2018). Drivers and Responses: Social Science Perspectives on Climate Change, Part 2 . Washington, DC: USGCRP Social Science Coordinating Committee. Available online at: https://www.globalchange.gov/content/social-science-perspectives-climate-change-workshop

Fitzharris, B. (2016). Reflections on climate and water over 50 years. Austr. J. Water Resour . 20, 93–107. doi: 10.1080/13241583.2017.1348888

Flato, G., Marotzke, J., Abiodun, B., Braconnot, P., Chou, S. C., Collins, W., et al. (2014). “Evaluation of climate models, in: Climate change 2013: the physical science basis,” in Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (New York, NY: Cambridge University Press), 741–866.

Foley, A. M. (2018). Climate impact assessment and “Islandness”: challenges and opportunities of knowledge production and decision-making for small island developing states. Int. J. Clim. Change Strat. Manage . 10, 289–302. doi: 10.1108/IJCCSM-06-2017-0142

Ford, J. D., Tilleard, S. E., Berrang-Ford, L., Araos, M., Biesbroek, R., Lesnikowski, A. C., et al. (2016). Opinion: Big data has big potential for applications to climate change adaptation. Proc. Natl. Acad. Sci. U.S.A . 113, 10729–10732. doi: 10.1073/pnas.1614023113

Franco, C., Hepburn, L. A., Smith, D. J., Nimrod, S., and Tucker, A. (2016). A Bayesian belief network to assess rate of changes in coral reef ecosystems. Environ. Model. Softw . 80, 132–142. doi: 10.1016/j.envsoft.2016.02.029

Friedl, M. A., and Brodley, C. E. (1997). Decision tree classification of land cover from remotely sensed data. Remote Sens. Environ . 61, 399–409. doi: 10.1016/S0034-4257(97)00049-7

Füssel, H.-M., and Klein, R. J. (2006). Climate change vulnerability assessments: an evolution of conceptual thinking. Climat. Change 75, 301–329. doi: 10.1007/s10584-006-0329-3

Gandhi, N., and Armstrong, L. J. (2016). “Assessing impact of seasonal rainfall on rice crop yield of Rajasthan, India using association rule mining,” in 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI) (Jaipur), 1021–1024. doi: 10.1109/ICACCI.2016.7732178

Gao, M., Yang, Y., Shi, H., and Gao, Z. (2019). Som-based synoptic analysis of atmospheric circulation patterns and temperature anomalies in China. Atmos. Res . 220, 46–56. doi: 10.1016/j.atmosres.2019.01.005

Garcia, M. E. J. N., and Hernandez, A. A. (2017). “Pattern analysis of natural disasters in the Philippines,” in International Conference on Big Data Technologies and Applications (Cham: Springer), 74–83. doi: 10.1007/978-3-319-98752-1_9

Giest, S. (2017). Big data analytics for mitigating carbon emissions in smart cities: opportunities and challenges. Eur. Plann. Stud . 25, 941–957. doi: 10.1080/09654313.2017.1294149

Gijzen, H. (2013). Big data for a sustainable future. Nature 502:38. doi: 10.1038/502038d

Gomez-Zavaglia, A., Mejuto, J., and Simal-Gandara, J. (2020). Mitigation of emerging implications of climate change on food production systems. Food Res. Int . 2020:109256. doi: 10.1016/j.foodres.2020.109256

Gondchawar, N., and Kawitkar, R. (2016). IOT based smart agriculture. Int. J. Adv. Res. Comput. Commun. Eng . 5, 838–842. doi: 10.17148/IJARCCE.2016.56188

Gouveia, J. P., and Palma, P. (2019). Harvesting big data from residential building energy performance certificates: retrofitting and climate change mitigation insights at a regional scale. Environ. Res. Lett . 14:095007. doi: 10.1088/1748-9326/ab3781

Goyal, M. K., and Sharma, A. (2016). A fuzzy c-means approach regionalization for analysis of meteorological drought homogeneous regions in western India. Nat. Hazards 84, 1831–1847. doi: 10.1007/s11069-016-2520-9

Gulzar, M., Abbas, G., and Waqas, M. (2020). “Climate smart agriculture: a survey and taxonomy,” in 2020 International Conference on Emerging Trends in Smart Technologies (ICETST) , 1–6. doi: 10.1109/ICETST49965.2020.9080695

Guo, H.-D., Zhang, L., and Zhu, L.-W. (2015). Earth observation big data for climate change research. Adv. Clim. Change Res . 6, 108–117. doi: 10.1016/j.accre.2015.09.007

Gurram, S., Sivaraman, V., Apple, J. T., and Pinjari, A. R. (2019). “Agent-based modeling to simulate road travel using big data from smartphone GPS: an application to the continental united states,” in 2019 IEEE International Conference on Big Data (Big Data) (Los Angeles, CA), 3553–3562. doi: 10.1109/BigData47090.2019.9006339

Gutierrez, B. T., Plant, N. G., and Thieler, E. R. (2011). A Bayesian network to predict coastal vulnerability to sea level rise. J. Geophys. Res . 116, 1–15. doi: 10.1029/2010JF001891

Hämäläinen, E., and Inkinen, T. (2019). Big data in emission producing manufacturing industries-an explorative literature review. ISPRS Ann. Photogr. Remote Sens. Spat. Inform. Sci . 4, 57–64. doi: 10.5194/isprs-annals-IV-4-W9-57-2019

Hamrani, A., Akbarzadeh, A., and Madramootoo, C. A. (2020). Machine learning for predicting greenhouse gas emissions from agricultural soils. Sci. Tot. Environ . 2020:140338. doi: 10.1016/j.scitotenv.2020.140338

Hassani, H., Huang, X., and Silva, E. (2019). Big data and climate change. Big Data Cogn. Comput . 3, 1–17. doi: 10.3390/bdcc3010012

Hazen, B. T., Skipper, J. B., Ezell, J. D., and Boone, C. A. (2016). Big data and predictive analytics for supply chain sustainability: a theory-driven research agenda. Comput. Industr. Eng . 101, 592–598. doi: 10.1016/j.cie.2016.06.030

He, Q., Guo, X., Li, D., Jin, Y., Zhang, L., and Zhang, R. (2020). “Research on the selection method of fy-3d/mwhts clear sky observation data based on neural network,” in Journal of Physics: Conference Series, Vol. 1656 (Qingdao: IOP Publishing), 012007. doi: 10.1088/1742-6596/1656/1/012007. Available online at: https://iopscience.iop.org/issue/1742-6596/1656/1

Hegerl, G. C., Brönnimann, S., Cowan, T., Friedman, A. R., Hawkins, E., Iles, C., et al. (2019). Causes of climate change over the historical record. Environ. Res. Lett . 14:123006. doi: 10.1088/1748-9326/ab4557

Heumann, B. W. (2011). An object-based classification of mangroves using a hybrid decision tree-support vector machine approach. Remote Sens . 3, 2440–2460. doi: 10.3390/rs3112440

Hong, J., Hong, T., Kang, H., and Lee, M. (2019). A framework for reducing dust emissions and energy consumption on construction sites. Energy Proc . 158, 5092–5096. doi: 10.1016/j.egypro.2019.01.637

Honti, G. M., and Abonyi, J. (2019). A review of semantic sensor technologies in internet of things architectures. Complexity 2019:6473160. doi: 10.1155/2019/6473160

Horcea-Milcu, A.-I., Martín-López, B., Lam, D. P., and Lang, D. J. (2020). Research pathways to foster transformation: linking sustainability science and social-ecological systems research. Ecol. Soc . 25, 1–29. doi: 10.5751/ES-11332-250113

Hou, D., Bolan, N. S., Tsang, D. C., Kirkham, M. B., and O'Connor, D. (2020). Sustainable soil use and management: an interdisciplinary and systematic approach. Sci. Tot. Environ . 2020:138961. doi: 10.1016/j.scitotenv.2020.138961

Howard, S., Howard, S., and Howard, S. (2020). Quantitative market analysis of the European climate services sector-the application of the kmatrix big data market analytical tool to provide robust market intelligence. Climate Serv . 17:100108. doi: 10.1016/j.cliser.2019.100108

Hsu, A., Höhne, N., Kuramochi, T., Roelfsema, M., Weinfurter, A., Xie, Y., et al. (2019). A research roadmap for quantifying non-state and subnational climate mitigation action. Nat. Clim. Change 9, 11–17. doi: 10.1038/s41558-018-0338-z

Hu, L. Q., Yadav, A., Khan, A., Liu, H., and Ul Haq, A. (2020). Application of big data fusion based on cloud storage in green transportation: an application of healthcare. Sci. Program. 2020:1593946. doi: 10.1155/2020/1593946

Hu, M., and Pavao-Zuckerman, M. (2019). Literature review of net zero and resilience research of the urban environment: a citation analysis using big data. Energies 12:1539. doi: 10.3390/en12081539

Hu, S., Niu, Z., and Chen, Y. (2017). Global wetland datasets: a review. Wetlands 37, 807–817. doi: 10.1007/s13157-017-0927-z

Hu, W., Li, C.-H., Ye, C., Wang, J., Wei, W.-W., and Deng, Y. (2019). Research progress on ecological models in the field of water eutrophication: Citespace analysis based on data from the ISI web of science database. Ecol. Model . 410:108779. doi: 10.1016/j.ecolmodel.2019.108779

Huang, J. (2015). Venture Capital Investment and Trend in Clean Technologies . New York, NY: Springer. doi: 10.1007/978-3-319-14409-2_11

Iacobuta, G., Dubash, N. K., Upadhyaya, P., Deribe, M., and Höhne, N. (2018). National climate change mitigation legislation, strategy and targets: a global update. Clim. Policy 18, 1114–1132. doi: 10.1080/14693062.2018.1489772

Inderwildi, O., Zhang, C., Wang, X., and Kraft, M. (2020). The impact of intelligent cyber-physical systems on the decarbonization of energy. Energy Environ. Sci . 13, 744–771. doi: 10.1039/C9EE01919G

Inukollu, V. N., Arsi, S., and Ravuri, S. R. (2014). Security issues associated with big data in cloud computing. Int. J. Netw. Secur. Appl . 6, 45–55. doi: 10.5121/ijnsa.2014.6304

Ise, T., and Oba, Y. (2020). Varenn: graphical representation of periodic data and application to climate studies. NPJ Clim. Atmos. Sci . 3, 1–6. doi: 10.1038/s41612-020-0129-x

Ishwarappa and Anuradha, J. (2015). A brief introduction on big data 5vs characteristics and hadoop technology. Proc. Comput. Sci . 48, 319–324. doi: 10.1016/j.procs.2015.04.188

Ismail, H., Kamal, M. R., bin Abdullah, A. F., and bin Mohd, M. S. F. (2020). Climate-smart agro-hydrological model for a large scale rice irrigation scheme in Malaysia. Appl. Sci . 10:3906. doi: 10.3390/app10113906

Iturriza, M., Labaka, L., Ormazabal, M., and Borges, M. (2020). Awareness-development in the context of climate change resilience. Urban Climate 32:100613. doi: 10.1016/j.uclim.2020.100613

Iwanaga, T., Wang, H.-H., Hamilton, S. H., Grimm, V., Koralewski, T. E., Salado, A., et al. (2020). Socio-technical scales in socio-environmental modeling: managing a system-of-systems modeling approach. Environ. Model. Softw . 135:104885. doi: 10.1016/j.envsoft.2020.104885

Jabbour, C. J. C., de Sousa Jabbour, A. B. L., Sarkis, J., and Godinho Filho, M. (2019). Unlocking the circular economy through new business models based on large-scale data: an integrative framework and research agenda. Technol. Forecast. Soc. Change 144, 546–552. doi: 10.1016/j.techfore.2017.09.010

Jang, H. S., Bae, K. Y., Park, H.-S., and Sung, D. K. (2016). Solar power prediction based on satellite images and support vector machine. IEEE Trans. Sustain. Energy 7, 1255–1263. doi: 10.1109/TSTE.2016.2535466

Jang, S. M., and Hart, P. S. (2015). Polarized frames on “climate change” and “global warming”? across countries and states: evidence from twitter big data. Glob. Environ. Change 32, 11–17. doi: 10.1016/j.gloenvcha.2015.02.010

Jato-Espino, D., Sillanpää, N., Andrés-Doménech, I., and Rodriguez-Hernandez, J. (2018). Flood risk assessment in urban catchments using multiple regression analysis. J. Water Resour. Plann. Manage . 144:04017085. doi: 10.1061/(ASCE)WR.1943-5452.0000874

Jimenez, S., Aviles, A., Galán, L., Flores, A., Matovelle, C., and Vintimilla, C. (2019). “Support vector regression to downscaling climate big data: an application for precipitation and temperature future projection assessment,” in Conference on Information Technologies and Communication of Ecuador (Cham: Springer), 182–193. doi: 10.1007/978-3-030-35740-5_13

Joshi, A., Pebesma, E., Henriques, R., and Appel, M. (2019). “SCIDB based framework for storage and analysis of remote sensing big data,” in International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (Dhulikhel), 42. doi: 10.5194/isprs-archives-XLII-5-W3-43-2019

Jung, J., Maeda, M., Chang, A., Bhandari, M., Ashapure, A., and Landivar-Bowles, J. (2020). The potential of remote sensing and artificial intelligence as tools to improve the resilience of agriculture production systems. Curr. Opin. Biotechnol . 70, 15–22. doi: 10.1016/j.copbio.2020.09.003

Kadow, C., Hall, D. M., and Ulbrich, U. (2020). Artificial intelligence reconstructs missing climate information. Nat. Geosci . 13, 408–413. doi: 10.1038/s41561-020-0582-5

Kates, R. W., Clark, W. C., Corell, R., Hall, J. M., Jaeger, C. C., Lowe, I., et al. (2001). Sustainability science. Science 292, 641–642. doi: 10.1126/science.1059386

Keliang, H. (2019). “Impacts of climate change on hydrological cycle in the Yangtze River Basin Based on regression analysis,” in 2019 International Conference on Civil Engineering, Materials and Environment (ICCEME 2019) (Changchun).

Klenk, N., and Meehan, K. (2015). Climate change and transdisciplinary science: problematizing the integration imperative. Environ. Sci. Policy 54, 160–167. doi: 10.1016/j.envsci.2015.05.017

Klimarechenzentrum, D. (2021). Climate Sciences and Supercomputers. Available online at: https://www.dkrz.de/about-en/aufgaben/hpc (accessed December 2, 2021).

Knutti, R., Stocker, T., Joos, F., and Plattner, G.-K. (2003). Probabilistic climate change projections using neural networks. Clim. Dyn . 21, 257–272. doi: 10.1007/s00382-003-0345-1

Komorowski, M., Marshall, D. C., Salciccioli, J. D., and Crutain, Y. (2016). “Exploratory data analysis,” in Secondary Analysis of Electronic Health Records (Cham: Springer). doi: 10.1007/978-3-319-43742-2_15

Kouloukoui, D., de Oliveira Marinho, M. M., da Silva Gomes, S. M., Kiperstok, A., and Torres, E. A. (2019). Corporate climate risk management and the implementation of climate projects by the world's largest emitters. J. Clean. Prod . 238:117935. doi: 10.1016/j.jclepro.2019.117935

Kubo, T., Uryu, S., Yamano, H., Tsuge, T., Yamakita, T., and Shirayama, Y. (2020). Mobile phone network data reveal nationwide economic value of coastal tourism under climate change. Tour. Manage . 77:104010. doi: 10.1016/j.tourman.2019.104010

Lake, I., and Barker, G. (2018). Climate change, foodborne pathogens and illness in higher-income countries. Curr. Environ. Health Rep . 5, 187–196. doi: 10.1007/s40572-018-0189-9

Lambrinos, L. (2019). “Internet of things in agriculture: a decision support system for precision farming,” in 2019 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech) (Fukuoka), 889–892. doi: 10.1109/DASC/PiCom/CBDCom/CyberSciTech.2019.00163

Laney, D. (2001). 3d Data Management: Controlling Data Volume, Velocity and Variety . META Group.

Lasso, E., and Corrales, J. C. (2017). “Towards an alert system for coffee diseases and pests in a smart farming approach based on semi-supervised learning and graph similarity,” in International Conference of ICT for Adapting Agriculture to Climate Change (Cham: Springer), 111–123. doi: 10.1007/978-3-319-70187-5_9

Lavin, A., and Klabjan, D. (2015). Clustering time-series energy data from smart meters. Energy Efficien . 8, 681–689. doi: 10.1007/s12053-014-9316-0

Leitold, D., Váthy-Fogarassy, A., and Abonyi, J. (2020). Network-Based Analysis of Dynamical Systems: Methods for Controllability and Observability Analysis, and Optimal Sensor Placement . Cham: Springer Nature. doi: 10.1007/978-3-030-36472-4

Lenton, T. M. (2011). Early warning of climate tipping points. Nat. Climate Change 1, 201–209. doi: 10.1038/nclimate1143

Li, Z., Bagan, H., and Yamagata, Y. (2018). Analysis of spatiotemporal land cover changes in inner Mongolia using self-organizing map neural network and grid cells method. Sci. Tot. Environ . 636, 1180–1191. doi: 10.1016/j.scitotenv.2018.04.361

Li, Z., Huang, Q., Carbone, G. J., and Hu, F. (2017). A high performance query analytical framework for supporting data-intensive climate studies. Comput. Environ. Urban Syst . 62, 210–221. doi: 10.1016/j.compenvurbsys.2016.12.003

Li, Z., Yang, C., Sun, M., Li, J., Xu, C., Huang, Q., et al. (2013). “A high performance web-based system for analyzing and visualizing spatiotemporal data for climate studies,” in International Symposium on Web and Wireless Geographical Information Systems (Berlin; Heidelberg: Springer), 190–198. doi: 10.1007/978-3-642-37087-8_14

Li, Z., Zhou, W., Liu, X., Qian, Y., Wang, C., Xie, Z., et al. (2019). “Research on association rules mining of atmospheric environment monitoring data,” in National Conference on Computer Science Technology and Education (Singapore: Springer), 86–98. doi: 10.1007/978-981-15-5390-5_8

Little, J. C., Hester, E. T., Elsawah, S., Filz, G. M., Sandu, A., Carey, C. C., et al. (2019). A tiered, system-of-systems modeling framework for resolving complex socio-environmental policy issues. Environ. Model. Softw . 112, 82–94. doi: 10.1016/j.envsoft.2018.11.011

Lozano, S., Calzada-Infante, L., Adenso-Díaz, B., and García, S. (2019). Complex network analysis of keywords co-occurrence in the recent efficiency analysis literature. Scientometrics 120, 609–629. doi: 10.1007/s11192-019-03132-w

Mabrouki, J., Azrour, M., Dhiba, D., Farhaoui, Y., and El Hajjaji, S. (2021). IOT-based data logger for weather monitoring using arduino-based wireless sensor networks with remote graphical application and alerts. Big Data Min. Anal . 4, 25–32. doi: 10.26599/BDMA.2020.9020018

Maheshwari, B., Pinto, U., Akbar, S., and Fahey, P. (2020). Is urbanisation also the culprit of climate change? Evidence from Australian cities. Urban Clim . 31:100581. doi: 10.1016/j.uclim.2020.100581

Majidi, B., Hemmati, O., Baniardalan, F., Farahmand, H., Hajitabar, A., Sharafi, S., et al. (2021). “Geo-spatiotemporal intelligence for smart agricultural environmental eco-cyber-physical systems,” in Enabling AI Applications in Data Science. Studies in Computational Intelligence , Vol. 911, eds A. E. Hassanien, M. H. N. Taha, N. E. M. Khalifa (Cham: Springer). doi: 10.1007/978-3-030-52067-0_21

Mallick, R. B., Jacobs, J. M., Miller, B. J., Daniel, J. S., and Kirshen, P. (2018). Understanding the impact of climate change on pavements with cmip5, system dynamics and simulation. Int. J. Pave. Eng . 19, 697–705. doi: 10.1080/10298436.2016.1199880

Manogaran, G., and Lopez, D. (2018). Spatial cumulative sum algorithm with big data analytics for climate change detection. Comput. Electr. Eng . 65, 207–221. doi: 10.1016/j.compeleceng.2017.04.006

Marcu, I., Voicu, C., Drăgulinescu, A. M. C., Fratu, O., Suciu, G., Balaceanu, C., et al. (2019). “Overview of IoT basic platforms for precision agriculture,” in International Conference on Future Access Enablers of Ubiquitous and Intelligent Infrastructures (Cham: Springer), 124–137. doi: 10.1007/978-3-030-23976-3_13

Maria, R. E., Junior, L. A. R., de Vasconcelos, L. E. G., Pinto, A. F. M., Tsoucamoto, P. T., Silva, H. N. A., et al. (2015). “Applying scrum in an interdisciplinary project using big data, internet of things, and credit cards,” in 2015 12th International Conference on Information Technology-New Generations (Las Vegas, NV), 67–72. doi: 10.1109/ITNG.2015.17

Mathivanan, S., and Jayagopal, P. (2019). A big data virtualization role in agriculture: a comprehensive review. Walailak J. Sci. Technol . 16, 55–70. doi: 10.48048/wjst.2019.3620

Meadows, D., Meadows, D., Randers, J., and Behrens, W. (1972). The Limits to Growth: A Report for the Club of Rome's Project on the Predicament of Mankind . New York, NY: New American Library. doi: 10.1349/ddlp.1

Meineke, E. K., Tomasi, C., Yuan, S., and Pryer, K. M. (2020). Applying machine learning to investigate long-term insect-plant interactions preserved on digitized herbarium specimens. Appl. Plant Sci . 8:e11369. doi: 10.1002/aps3.11369

Meraner, A., Ebel, P., Zhu, X. X., and Schmitt, M. (2020). Cloud removal in sentinel-2 imagery using a deep residual neural network and SAR-optical data fusion. ISPRS J. Photogr. Remote Sens . 166, 333–346. doi: 10.1016/j.isprsjprs.2020.05.013

Milojevic-Dupont, N., and Creutzig, F. (2020). Machine learning for geographically differentiated climate change mitigation in urban areas. Sustain. Cities Soc . 2020:102526. doi: 10.1016/j.scs.2020.102526

Mourtzios, C., Kourtesis, D., Papadimitriou, N., Antzoulatos, G., Kouloglou, I. O., Vrochidis, S., et al. (2021). “Work-in-progress: smart-water, a Novel Telemetry and remote control system infrastructure for the management of water consumption in Thessaloniki,” in Internet of Things, Infrastructures and Mobile Applications. IMCL 2019. Advances in Intelligent Systems and Computing, Vol. 1192, eds M. E. Auer, and T. Tsiatsos (Cham: Springer). doi: 10.1007/978-3-030-49932-7_89

Nguyen, T. B., Wagner, F., and Schoepp, W. (2012). “EC4MACS-an integrated assessment toolbox of well-established modeling tools to explore the synergies and interactions between climate change, air quality and other policy objectives,” in International Conference on Information and Communication on Technology (Berlin; Heidelberg: Springer), 94–108. doi: 10.1007/978-3-642-32606-6_8

Nobre, G. C., and Tavares, E. (2017). Scientific literature analysis on big data and internet of things applications on circular economy: a bibliometric study. Scientometrics 111, 463–492. doi: 10.1007/s11192-017-2281-6

Olaya-Abril, A., Parras-Alcántara, L., Lozano-García, B., and Obregón-Romero, R. (2017). Soil organic carbon distribution in mediterranean areas under a climate change scenario via multiple linear regression analysis. Sci. Tot. Environ . 592, 134–143. doi: 10.1016/j.scitotenv.2017.03.021

Pachepsky, Y., Rajkai, K., and Tóth, B. (2015). Pedotransfer in soil physics: trends and outlook–a review. Agrokémia és Talajtan 64, 339–360. doi: 10.1556/0088.2015.64.2.3

Park, S.-T., Kim, D.-Y., and Li, G. (2020). An analysis of environmental big data through the establishment of emotional classification system model based on machine learning: focus on multimedia contents for portal applications. Multimed. Tools Appl . 1–19. doi: 10.1007/s11042-020-08818-5

Pauliuk, S. (2020). Making sustainability science a cumulative effort. Nat. Sustainabil . 3, 2–4. doi: 10.1038/s41893-019-0443-7

Peterson, A. T., Ortega-Huerta, M. A., Bartley, J., Sánchez-Cordero, V., Soberón, J., Buddemeier, R., et al. (2002). Future projections for Mexican faunas under global climate change scenarios. Nature 416, 626–629. doi: 10.1038/416626a

Pileggi, S. F., and Lamia, S. A. (2020). Climate change timeline: an ontology to tell the story so far. IEEE Access 8, 65294–65312. doi: 10.1109/ACCESS.2020.2985112

Poff, N. L., Tokar, S., and Johnson, P. (1996). Stream hydrological and ecological responses to climate change assessed with an artificial neural network. Limnol. Oceanogr . 41, 857–863. doi: 10.4319/lo.1996.41.5.0857

Qin, Y., and Chi, M. (2020). Rsimagenet: a universal deep semantic segmentation lifecycle for remote sensing images. IEEE Access 8, 68254–68267. doi: 10.1109/ACCESS.2020.2986514

Radhika, T., Gouda, K. C., and Kumar, S. S. (2016). “Big data research in climate science,” in 2016 International Conference on Communication and Electronics Systems (ICCES) (Coimbatore), 1–6. doi: 10.1109/CESYS.2016.7889855

Rahmati, O., Darabi, H., Haghighi, A. T., Stefanidis, S., Kornejady, A., Nalivan, O. A., et al. (2019). Urban flood hazard modeling using self-organizing map neural network. Water 11:2370. doi: 10.3390/w11112370

Rajaraman, V. (2016). Big data analytics. Resonance 21, 695–716. doi: 10.1007/s12045-016-0376-7

Rao, N. (2018). Big data and climate smart agriculture-status and implications for agricultural research and innovation in India. Proc. Indian Natl. Sci. Acad . 84, 625–640. doi: 10.16943/ptinsa/2018/49342

Rashid, R. A., Nohuddin, P. N., and Zainol, Z. (2017). “Association rule mining using time series data for Malaysia climate variability prediction,” in International Visual Informatics Conference (Cham: Springer), 120–130. doi: 10.1007/978-3-319-70010-6_12

Raut, R. D., Mangla, S. K., Narwane, V. S., Gardas, B. B., Priyadarshinee, P., and Narkhede, B. E. (2019). Linking big data analytics and operational sustainability practices for sustainable business management. J. Clean. Prod . 224, 10–24. doi: 10.1016/j.jclepro.2019.03.181

Rees, E., Ng, V., Gachon, P., Mawudeku, A., McKenney, D., Pedlar, J., et al. (2019). Risk assessment strategies for early detection and prediction of infectious disease outbreaks associated with climate change. Can. Commun. Dis. Rep . 45, 119–126. doi: 10.14745/ccdr.v45i05a02

Rockström, J., Steffen, W., Noone, K., Persson, A., Chapin, F. S. III., Lambin, E., et al. (2009). Planetary boundaries: exploring the safe operating space for humanity. Ecol. Soc . 14, 1–33. doi: 10.5751/ES-03180-140232

Rogelj, J., Den Elzen, M., Höhne, N., Fransen, T., Fekete, H., Winkler, H., et al. (2016). Paris agreement climate proposals need a boost to keep warming well below 2 c, Nature 534, 631–639. doi: 10.1038/nature18307

Rolnick, D., Donti, P. L., Kaack, L. H., Kochanski, K., Lacoste, A., Sankaran, K., et al. (2019). Tackling climate change with machine learning. arXiv [Preprint]. arXiv:1906.05433.

Ross, S. A., and Cheah, L. (2019). Uncertainty quantification in life cycle assessments: exploring distribution choice and greater data granularity to characterize product use. J. Indus. Ecol . 23, 335–346. doi: 10.1111/jiec.12742

Sarker, M. N. I., Yang, B., Lv, Y., Huq, M. E., and Kamruzzaman, M. (2020). Climate change adaptation and resilience through big data. Sci. Inform. Organ . 11, 533–539. doi: 10.14569/IJACSA.2020.0110368

Sasaki, S., Kiyoki, Y., Sarkar-Swaisgood, M., Wijitdechakul, J., Rachmawan, I. E. W., Srivastava, S., et al. (2020). 5d world map system for disaster-resilience monitoring from global to local: environmental AI system for leading SDG 9 and 11. Inform. Model. Knowl. Bases 321:306.

Schnase, J. L., Lee, T. J., Mattmann, C. A., Lynnes, C. S., Cinquini, L., Ramirez, P. M., et al. (2016). Big data challenges in climate science: improving the next-generation cyberinfrastructure. IEEE Geosci. Remote Sens. Mag . 4, 10–22. doi: 10.1109/MGRS.2015.2514192

Scholze, M., Knorr, W., Arnell, N. W., and Prentice, I. C. (2006). A climate-change risk analysis for world ecosystems. Proc. Natl. Acad. Sci. U.S.A . 103, 13116–13120. doi: 10.1073/pnas.0601816103

Sebestyén, V., Bulla, M., Rédey, Á., and Abonyi, J. (2019). Network model-based analysis of the goals, targets and indicators of sustainable development for strategic environmental assessment. J. Environ. Manage . 238, 126–135. doi: 10.1016/j.jenvman.2019.02.096

Seles, B. M. R. P., de Sousa Jabbour, A. B. L., Jabbour, C. J. C., de Camargo Fiorini, P., Mohd-Yusoff, Y., and Thomé, A. M. T. (2018). Business opportunities and challenges as the two sides of the climate change: corporate responses and potential implications for big data management towards a low carbon society. J. Clean. Prod . 189, 763–774. doi: 10.1016/j.jclepro.2018.04.113

Semlali, B.-E. B., and El Amrani, C. (2021). Big data and remote sensing: a new software of ingestion. Int. J. Electr. Comput. Eng . 11, 1521–1530. doi: 10.11591/ijece.v11i2.pp1521-1530

Semlali, B.-E. B., El Amrani, C., and Ortiz, G. (2020). Sat-ETL-integrator: an extract-transform-load software for satellite big data ingestion. J. Appl. Remote Sens . 14:018501. doi: 10.1117/1.JRS.14.018501

Senay, G. B., Schauer, M., Friedrichs, M., Velpuri, N. M., and Singh, R. K. (2017). Satellite-based water use dynamics using historical landsat data (1984-2014) in the southwestern United States. Remote Sens. Environ . 202, 98–112. doi: 10.1016/j.rse.2017.05.005

Seyedzadeh, S., Rahimian, F. P., Glesk, I., and Roper, M. (2018). Machine learning for estimation of building energy consumption and performance: a review. Visual. Eng . 6:5. doi: 10.1186/s40327-018-0064-7

Sharif, M., and Burn, D. H. (2006). Simulating climate change scenarios using an improved k-nearest neighbor model. J. Hydrol . 325, 179–196. doi: 10.1016/j.jhydrol.2005.10.015

Sharifi, A. (2019). A critical review of selected smart city assessment tools and indicator sets. J. Clean. Prod . 233, 1269–1283. doi: 10.1016/j.jclepro.2019.06.172

Shirkhorshidi, A. S., Aghabozorgi, S., Wah, T. Y., and Herawan, T. (2014). “Big data clustering: a review,” in International Conference on Computational Science and Its Applications (Cham: Springer), 707–720. doi: 10.1007/978-3-319-09156-3_49

Song, M., Cen, L., Zheng, Z., Fisher, R., Liang, X., Wang, Y., et al. (2017). How would big data support societal development and environmental sustainability? Insights and practices. J. Clean. Product . 142, 489–500. doi: 10.1016/j.jclepro.2016.10.091

Steffen, W., Richardson, K., Rockström, J., Cornell, S. E., Fetzer, I., Bennett, E. M., et al. (2015). Planetary boundaries: guiding human development on a changing planet. Science 347:6223. doi: 10.1126/science.1259855

Stengel, K., Glaws, A., and King, R. (2019). Physics-informed super resolution of climatological wind and solar resource data. AGUFM 2019:A43E-04. Available online at: https://ui.adsabs.harvard.edu/abs/2019AGUFM.A43E.04S/abstract

Suchetha, K., and Guruprasad, H. (2015). Integration of iot, cloud and big data. Glob. J. Eng. Sci. Res . 2, 251–258. Available online at: http://www.gjesr.com/Issues%20PDF/Archive-2015/July-2015/34.pdf

Sullivan, C. A., and Huntingford, C. (2009). “Water resources, climate change and human vulnerability,” in 18th World IMACS/MODSIM Congress (Cairns), 1–8.

Tang, J., Körner, C., Muraoka, H., Piao, S., Shen, M., Thackeray, S. J., et al. (2016). Emerging opportunities and challenges in phenology: a review. Ecosphere 7, 1–17. doi: 10.1002/ecs2.1436

Tannahill, B. K., and Jamshidi, M. (2014). System of systems and big data analytics-bridging the gap. Comput. Electr. Eng . 40, 2–15. doi: 10.1016/j.compeleceng.2013.11.016

Taranto, F., Nicolia, A., Pavan, S., De Vita, P., and D'Agostino, N. (2018). Biotechnological and digital revolution for climate-smart plant breeding. Agronomy 8, 2–20. doi: 10.3390/agronomy8120277

Taylor, K. E., Stouffer, R. J., and Meehl, G. A. (2012). An overview of cmip5 and the experiment design. Bull. Am. Meteorol. Soc . 93, 485–498. doi: 10.1175/BAMS-D-11-00094.1

Teng, S., Khong, K. W., and Ha, N. C. (2020). Palm oil and its environmental impacts: a big data analytics study. J. Clean. Prod . 274:122901. doi: 10.1016/j.jclepro.2020.122901

Teng, S.-Y., Ou, C.-K., and Chuang, K.-T. (2019). On the discovery of spatial-temporal fluctuating patterns. Int. J. Data Sci. Analyt . 8, 57–75. doi: 10.1007/s41060-018-0159-1

Toujani, A., Achour, H., Turki, S. Y., and Faíz, S. (2020). Estimating forest losses using spatio-temporal pattern-based sequence classification approach. Appl. Artif. Intell . 1–25. doi: 10.1080/08839514.2020.1790247

Trifu, M. R., and Ivan, M. L. (2014). Big data: present and future. Database Syst. J . 5, 32–41. Available online at: https://www.dbjournal.ro/archive/15/15_4.pdf

Tripathi, S., Srinivas, V., and Nanjundiah, R. S. (2006). Downscaling of precipitation for climate change scenarios: a support vector machine approach. J. Hydrol . 330, 621–640. doi: 10.1016/j.jhydrol.2006.04.030

Uddin, M. N., Islam, A. S., Bala, S. K., Islam, G. T., Adhikary, S., Saha, D., et al. (2019). Mapping of climate vulnerability of the coastal region of Bangladesh using principal component analysis. Appl. Geogr . 102, 47–57. doi: 10.1016/j.apgeog.2018.12.011

UN (2016). A/RES/70/1Transforming our world: The 2030 agenda for sustainable development. United Nations.

Van der Linden, S. (2015). The social-psychological determinants of climate change risk perceptions: towards a comprehensive model. J. Environ. Psychol . 41, 112–124. doi: 10.1016/j.jenvp.2014.11.012

Venkatasubramanian, V., Rengaswamy, R., and Kavuri, S. N. (2003). A review of process fault detection and diagnosis: Part II: qualitative models and search strategies. Comput. Chem. Eng . 27, 313–326. doi: 10.1016/S0098-1354(02)00161-8

Wang, G., Mang, S., Cai, H., Liu, S., Zhang, Z., Wang, L., et al. (2016). Integrated watershed management: evolution, development and emerging trends. J. For. Res . 27, 967–994. doi: 10.1007/s11676-016-0293-3

Wang, M., Ullrich, P., and Millstein, D. (2020). Future projections of wind patterns in California with the variable-resolution CESM: a clustering analysis approach. Climate Dyn . 54, 2511–2531. doi: 10.1007/s00382-020-05125-5

Wang, Q., Huang, J., Liu, R., Men, C., Guo, L., Miao, Y., et al. (2020). Sequence-based statistical downscaling and its application to hydrologic simulations based on machine learning and big data. J. Hydrol . 2020:124875. doi: 10.1016/j.jhydrol.2020.124875

Wang, Z., Xue, M., Wang, Y., Song, M., Li, S., Daziano, R. A., et al. (2019). Big data: new tend to sustainable consumption research. J. Clean. Prod . 236:117499. doi: 10.1016/j.jclepro.2019.06.330

Wright, C., and Nyberg, D. (2017). An inconvenient truth: how organizations translate climate change into business as usual. Acad. Manage. J . 60, 1633–1661. doi: 10.5465/amj.2015.0718

Wu, W., Simpson, A. R., and Maier, H. R. (2010). Accounting for greenhouse gas emissions in multiobjective genetic algorithm optimization of water distribution systems. J. Water Resour. Plann. Manage . 136, 146–155. doi: 10.1061/(ASCE)WR.1943-5452.0000020

Xie, B., Brewer, M. B., Hayes, B. K., McDonald, R. I., and Newell, B. R. (2019). Predicting climate change risk perception and willingness to act. J. Environ. Psychol . 65:101331. doi: 10.1016/j.jenvp.2019.101331

Xie, H., Zhang, Y., Choi, Y., and Li, F. (2020). A scientometrics review on land ecosystem service research. Sustainability 12:2959. doi: 10.3390/su12072959

Xie, Y., Kou, X., and Li, P. (2019). A simulation method of three-dimensional cloud over WRF big data, EURASIP J. Wireless Commun. Network . 2019, 1–10. doi: 10.1186/s13638-019-1584-0

Xu, G., Shi, Y., Sun, X., and Shen, W. (2019). Internet of things in marine environment monitoring: a review. Sensors 19:1711. doi: 10.3390/s19071711

Yan, Y., Xu, X., Liu, X., Wen, Y., and Ou, J. (2020). Assessing the contributions of climate change and human activities to cropland productivity by means of remote sensing. Int. J. Remote Sens . 41, 2004–2021. doi: 10.1080/01431161.2019.1681603

Yan-e, D. (2011). “Design of intelligent agriculture management information system based on IOT,” in 2011 Fourth International Conference on Intelligent Computation Technology and Automation, Vol. 1 (Shenzhen), 1045–1049. doi: 10.1109/ICICTA.2011.262

Yang, C., Su, G., and Chen, J. (2017). “Using big data to enhance crisis response and disaster resilience for a smart city,” in 2017 IEEE 2nd International Conference on Big Data Analysis (ICBDA) (Beijing), 504–507. doi: 10.1109/ICBDA.2017.8078684

Yang, J., Gong, P., Fu, R., Zhang, M., Chen, J., Liang, S., et al. (2013). The role of satellite remote sensing in climate change studies. Nat. Clim. Change 3, 875–883. doi: 10.1038/nclimate1908

Yuan, M., and Bothwell, J. (2013). “Space-time analytics for spatial dynamics,” in Data Mining: Concepts, Methodologies, Tools, and Applications , ed I. Management Association (Hershey, PA: IGI Global), 2117–2131. doi: 10.4018/978-1-4666-2455-9.ch108

Yusof, N., and Zurita-Milla, R. (2017). Mapping frequent spatio-temporal wind profile patterns using multi-dimensional sequential pattern mining. Int. J. Digit. Earth 10, 238–256. doi: 10.1080/17538947.2016.1217943

Zaki, M. J., and Ho, C.-T. (2000). Large-Scale Parallel Data Mining, No. 1759 . Springer Science & Business Media. doi: 10.1007/3-540-46502-2

Zare, M., and Koch, M. (2018). Groundwater level fluctuations simulation and prediction by anfis-and hybrid wavelet-anfis/fuzzy c-means (Fcm) clustering models: application to the miandarband plain. J. Hydro Environ. Res . 18, 63–76. doi: 10.1016/j.jher.2017.11.004

Zhang, H., Xu, Y., and Kanyerere, T. (2020). A review of the managed aquifer recharge: historical development, current situation and perspectives. Phys. Chem. Earth Parts A/B/C 2020:102887. doi: 10.1016/j.pce.2020.102887

Zhao, J., Yao, L., Huang, Z. C., Zhang, L. C., Liu, Y., and Li, G. Q. (2019). “International reanalysis cooperation on carbon satellites data,” in Proc. SPIE 11152, Remote Sensing of Clouds and the Atmosphere XXIV (Strasbourg) , 111520L. doi: 10.1117/12.2538614

Zheng, F., Tao, R., Maier, H. R., See, L., Savic, D., Zhang, T., et al. (2018). Crowdsourcing methods for data collection in geophysics: state of the art, issues, and future directions. Rev. Geophys . 56, 698–740. doi: 10.1029/2018RG000616

Zheng, Y., Ren, D., Guo, Z., Hu, Z., and Wen, Q. (2019). Research on integrated resource strategic planning based on complex uncertainty simulation with case study of china. Energy 180, 772–786. doi: 10.1016/j.energy.2019.05.120

Keywords: big data, climate change, modeling, systems of systems, data science, climate computing

Citation: Sebestyén V, Czvetkó T and Abonyi J (2021) The Applicability of Big Data in Climate Change Research: The Importance of System of Systems Thinking. Front. Environ. Sci. 9:619092. doi: 10.3389/fenvs.2021.619092

Received: 19 October 2020; Accepted: 24 February 2021; Published: 17 March 2021.


Copyright © 2021 Sebestyén, Czvetkó and Abonyi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: János Abonyi, janos@abonyilab.com

This article is part of the Research Topic

Frontiers in Environmental Science – Editor’s Picks 2021

M.Tech/Ph.D Thesis Help in Chandigarh | Thesis Guidance in Chandigarh


The internet is far more widely used today than it was a few years ago, and it has become a core part of our lives. Billions of people across the globe use social media and social networking every day. So many users generate a flood of data that has become very complex to manage. The term coined to refer to this huge amount of data is Big Data. The concept is spreading fast all over the world, and it is a trending topic for theses, projects, research, and dissertations. There are various good topics for a master's thesis, Ph.D. work, and research in Big Data and Hadoop. First of all, what are Big Data and Hadoop?

Find the link at the end to download the latest thesis and research topics in Big Data

What is Big Data?

Big Data refers to a large volume of data, structured or unstructured, that requires new technologies and techniques to handle. Organized data is known as structured data, while unorganized data is known as unstructured data. The data sets in big data are so large and complex that traditional application software cannot handle them. Frameworks such as Hadoop are designed for processing big data. These techniques are also used to extract useful insights from data through predictive analysis, user behavior analysis, and other analytics. You can explore the fundamentals of big data further while working on a thesis in Big Data. Big Data is defined by three Vs:

  • Volume – The amount of data that is generated. The data can be low-density, high-volume, structured or unstructured, or of unknown value. Technologies like Hadoop turn this data of unknown value into useful information. The volume can range from terabytes to petabytes.
  • Velocity – The rate at which data is generated. Data arrives at unprecedented speed and must be acted upon in a timely manner. Internet of Things (IoT) applications in particular require real-time evaluation and action.
  • Variety – The different formats of data. Data may be structured, unstructured, or semi-structured, and it can be audio, video, text, or email. Additional processing is required to derive meaning from such data and to support its metadata.

In addition to these three Vs, the following Vs are also defined for big data:

  • Value – Each form of data has some value that needs to be discovered. Qualitative and quantitative techniques are used to derive meaning from data, and deriving value often requires new discoveries and techniques.
  • Variability – The flow of data can be high or low, and managing this changing flow is a challenge.

Thesis Research Topics in Big Data

  • Privacy and Security Issues in Big Data.
  • Scalable Storage Systems for Big Data.
  • Software and Tools for Massive Big Data Processing.
  • Techniques and Data Mining Tools for Big Data.
  • Big Data Adoption and Analytics on Cloud Computing Platforms.
  • Scalable Architectures for Parallel Data Processing.

Can you imagine how big big data is? The amount of data generated and stored on a global scale is enormous and grows day by day. Yet only a small portion of this data is actually analyzed for useful insights and information.

Big Data Hadoop

Hadoop is an open-source framework for storing and processing big data. It uses simple programming models to process big data in a distributed environment across clusters of computers. Hadoop provides storage for large volumes of data along with substantial processing power, and it can handle many concurrent tasks and jobs.

Big Data Hadoop Architecture

HDFS is the main storage component of the Hadoop architecture. It stands for Hadoop Distributed File System. It stores large amounts of data by spreading them across multiple machines. MapReduce is another component of the architecture; it processes the data in a distributed manner across multiple machines. The YARN component manages the cluster's processing resources, such as CPU and memory. Resource Manager and Node Manager are the elements of YARN, and they work as master and worker. The Resource Manager is the master and assigns resources to the workers, i.e. the Node Managers. Each Node Manager reports to the master when it is going to start work. Covering the Hadoop architecture in your big data thesis will be a plus point.

Importance of Hadoop in big data

Hadoop is essential for working with big data. Its importance is highlighted in the following points:

  • Processing of huge chunks of data – With Hadoop, we can store and process huge amounts of data, mainly data from social media and IoT (Internet of Things) applications.
  • Computation power – Hadoop can process big data quickly because it distributes the processing across many machines.
  • Fault tolerance – Hadoop protects data and jobs against hardware failure. If a node in the cluster goes down, the other nodes continue to function, and copies of the data are stored on several nodes.
  • Flexibility – You can store as much data as you require, and there is no need to preprocess the data before storing it.
  • Low cost – Hadoop is an open-source framework and free to use. It runs on commodity hardware to store large quantities of data.
  • Scalability – The system can be grown easily by adding nodes according to requirements. Minimal administration is required.
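The fault-tolerance point can be illustrated with a small sketch. It assumes a replication factor of 3 (the HDFS default) and uses a simple round-robin placement; the function names and node names are hypothetical, and real HDFS placement is rack-aware rather than round-robin.

```python
REPLICATION = 3  # assumption: HDFS's default replication factor


def place_blocks(num_blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin.

    Simplified stand-in for HDFS block placement (hypothetical helper).
    """
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


def readable_blocks(placement, failed):
    """Return the blocks that still have at least one live replica."""
    return [
        block
        for block, holders in placement.items()
        if any(node not in failed for node in holders)
    ]


nodes = ["node1", "node2", "node3", "node4"]  # made-up cluster
placement = place_blocks(6, nodes)

# One node goes down: every block is still readable from another replica.
survivors = readable_blocks(placement, failed={"node2"})
print(len(survivors), "of", len(placement), "blocks readable")
```

Data is lost only when every node holding a given block fails at the same time, which is why keeping several copies makes a single hardware failure harmless.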

Challenges of Hadoop

Hadoop is a very good platform for big data solutions, but it still has certain challenges.

These challenges are:

  • Not all problems can be solved – Hadoop is not well suited to iterative and interactive tasks. It is efficient for simple problems that can be divided into independent units.
  • Talent gap – There is a lack of skilled MapReduce programmers in the big data field, especially at entry level.
  • Security of data – Data security is another challenge. Hadoop supports the Kerberos authentication protocol as a step toward solving data security issues.
  • Lack of tools – There is a lack of tools for data cleaning, management, and governance. Tools for data quality and standardization are also lacking.

Fields under Big Data

Big Data is a vast field and there are a number of topics and fields under it on which you can work for your thesis, dissertation as well as for research. Big Data is just an umbrella term for these fields.

  • Search Engine Data – Data stored by search engines like Google and Bing, retrieved from different databases.
  • Social Media Data – Data collected from social media platforms like Facebook and Twitter.
  • Stock Exchange Data – Data from companies involved in share trading on the stock market.
  • Black Box Data – The black box is a component of airplanes and helicopters that records the voices of the flight crew and other metrics.

Big Data Technologies

Big Data technologies are required for more detailed analysis, greater accuracy, and concrete decision making. They lead to greater efficiency, lower cost, and lower risk. This requires a powerful infrastructure to manage and process huge volumes of data.

The data can be analyzed with techniques like A/B Testing, Machine Learning, and Natural Language Processing.

The big data technologies include business intelligence, cloud computing, and databases.

The visualization of data can be done through the medium of charts and graphs.

Multi-dimensional big data can be handled through tensor-based computation. Tensors generalize scalars, vectors, and matrices to more dimensions, and tensor-based computation works with the linear relations between them. Other technologies that can be applied to big data are:

  • Massively Parallel Processing
  • Search-based applications
  • Data Mining
  • Distributed databases
  • Cloud Computing

These technologies are provided by vendors such as Amazon, Microsoft, and IBM to manage big data.

MapReduce Algorithm for Big Data

A large amount of data cannot be processed using traditional data processing approaches. Google addressed this problem with a programming model known as MapReduce. With MapReduce, a task is divided into small parts, and these parts are assigned to distributed computers connected over a network. The results are then collected from the individual computers to form the final dataset.

Hadoop uses MapReduce to run applications in which data is processed in parallel on different nodes. With the Hadoop framework, developers can build applications that run on clusters of computers to perform statistical analysis of large amounts of data.

The MapReduce algorithm consists of two tasks: Map and Reduce.

Map takes a set of data and converts it into another set of data in which individual elements are broken down into key-value pairs known as tuples. Reduce takes the output of the Map task as its input and combines those tuples into a smaller set of tuples.

The MapReduce algorithm is executed in three stages: Map, Shuffle, and Reduce.

In the Map stage, the input data, which is stored in the Hadoop Distributed File System (HDFS), is passed to a mapper, which processes it into small chunks of data. The Shuffle and Reduce stages occur in combination: the intermediate data is grouped by key, and the Reducer processes the mapper's output to create a new set of output, which is later stored in HDFS. Hadoop assigns the Map and Reduce tasks to appropriate servers in the cluster, and the framework manages all the details, such as issuing tasks, verifying their completion, and copying data between nodes. After completion, the results are collected at the Hadoop server. You can get thesis and dissertation guidance on Big Data Hadoop from a data analyst.
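The three stages can be sketched on a single machine with a word count, the customary first MapReduce example. This is a minimal illustration in plain Python, not Hadoop's API; the function names are hypothetical, and on a real cluster each stage would run in parallel on different nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: turn one input line into (word, 1) key-value tuples."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single tuple."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


lines = ["big data needs big tools", "data tools scale"]  # made-up input
counts = word_count(lines)
print(counts)
```

Because each call to `map_phase` depends only on its own line and each call to `reduce_phase` depends only on its own key, both stages can be spread over many machines without coordination, which is what makes the model scale.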

Applications of Big Data

Big Data finds application in various areas, including retail, finance, digital media, healthcare, and customer services.

Big Data is used in government services to improve cost efficiency, productivity, and innovation. A commonly cited example is the 2014 Indian general election, in which the BJP used data analytics in its campaign. Such analysis can involve collaboration between local and central government. Big Data is also often cited as a major factor in Barack Obama's win in the 2012 election campaign.

Big Data is used in finance for market prediction. It is also used for compliance and regulatory reporting, risk analysis, fraud detection, high-speed trading, and analytics. The data used for market prediction is known as alternative data.

Big Data is used in healthcare services for clinical data analysis, disease pattern analysis, the supply of medical devices and medicines, drug discovery, and various other analytics. Big Data analytics has helped in a major way to improve healthcare systems. It has enabled technologies such as eHealth, mHealth, and wearable health gadgets.

Media uses Big Data for purposes such as ad targeting, forecasting, clickstream analytics, campaign management, and loyalty programs. It mainly focuses on the following three points:

  • Targeting consumers
  • Capturing of data
  • Data journalism

Big Data is at the core of the IoT (Internet of Things), and the two work together. Data extracted from IoT devices can be used to map how devices are interconnected. The media industry can use this mapping to target customers and improve media efficiency.

Information Technology

Big Data has helped employees in Information Technology work more efficiently and has supported the widespread distribution of Information Technology. Certain issues in Information Technology can also be resolved using Big Data. Big Data principles can be applied to machine learning and artificial intelligence to provide better solutions to problems.

Advantages of Big Data

Big Data has certain advantages and benefits, particularly for big organizations.

  • Time Management – Big data tools save valuable time: rather than spending hours managing different sets of data, the data can be managed efficiently and at a faster pace.
  • Accessibility – Big Data is easily accessible through authorization and data access rights and privileges.
  • Trustworthy – Big Data is trustworthy in the sense that we can get valuable insights from the data.
  • Relevant – The data is relevant, whereas irrelevant data requires filtering, which can add complexity.
  • Secure – The data is secured through data hosting and various advanced technologies and techniques.

Challenges of Big Data

Although Big Data has gone a long way toward improving the way we store data, there are certain challenges that need to be resolved.

  • Data storage and quality of data – Data is growing at a fast pace as the number of companies and organizations grows, and storing it properly has become a challenge. The data can be stored in data warehouses, but it is often inconsistent. Errors, duplication, and conflicts arise when data is stored in its native format, and these reduce the quality of the data.
  • Lack of big data analysts – There is a huge demand for data scientists and analysts who can understand and analyze this data, but very few people can work in this field relative to the amount of data produced every day. Many of those who do lack the proper skills.
  • Quality analysis – Big companies and organizations use big data to get useful insights and make sound decisions about future plans. The data must be accurate, because inaccurate data can lead to wrong decisions that affect the business. The quality of the data therefore has to be analyzed. This requires testing, which is time-consuming and relies on expensive tools.
  • Security and privacy of data – Security and privacy are the biggest risks in big data. The tools used for analyzing, storing, and managing data draw on many different sources, which makes the data vulnerable to exposure and increases security and privacy concerns.

Thus Big Data is providing a great help to companies and organizations to make better decisions. This will ultimately lead to more profit. The main thesis topics in Big Data and Hadoop include applications, architecture, Big Data in IoT, MapReduce, Big Data Maturity Model etc.

Latest Thesis and Research Topics in Big Data

There are various thesis and research topics in big data for M.Tech and Ph.D. The following is a list of good big data topics for a master's thesis and research:

  • Big Data Virtualization
  • Internet of Things (IoT)
  • Big Data Maturity Model
  • Data Science
  • Data Federation
  • Sampling
  • Big Data Analytics
  • Clustering
  • SQL-on-Hadoop
  • Predictive Analytics

Big Data Virtualization is the process of creating virtual structures, rather than physical ones, for Big Data systems. It helps big enterprises and organizations use their data assets to achieve their goals and objectives. Virtualization tools are available to handle big data analytics.

Big Data and IoT work alongside each other. IoT devices capture data, which is extracted to support the connectivity of devices. IoT devices have sensors that sense data from their surroundings and can act according to their environment.

Big Data Maturity Models are used to measure the maturity of an organization's big data capabilities. These models help organizations assess those capabilities and create a structure around their data. The main goal of these models is to guide organizations in setting their development goals.

Data Science is closely related to Data Mining, in which valuable insights and information are extracted from both structured and unstructured data. Data Science employs techniques and methods from mathematics, statistics, and computer science for processing.

Data Federation is the process of collecting data from different databases without copying or transferring the original data. Rather than the whole information, data federation collects metadata, which describes the structure of the original data, and keeps it in a single database.
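The idea of querying several databases in place can be sketched with Python's built-in `sqlite3` module. This is only an analogy under stated assumptions: SQLite's `ATTACH DATABASE` stands in for a federation layer, the two in-memory databases stand in for separate source systems, and the table names and rows are made up. Real federation tools span heterogeneous, remote systems.

```python
import sqlite3

# First source database (in memory, standing in for a separate system).
conn = sqlite3.connect(":memory:")
# Second, independent database attached under the alias "sales".
conn.execute("ATTACH DATABASE ':memory:' AS sales")

# Each source keeps its own table; nothing is copied between them.
conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE sales.orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO main.customers VALUES (?, ?)", [(1, "Asha"), (2, "Ben")]
)
conn.executemany(
    "INSERT INTO sales.orders VALUES (?, ?)", [(1, 20.0), (1, 5.0), (2, 12.5)]
)

# One query spans both databases through their schema-qualified names.
rows = conn.execute(
    """
    SELECT c.name, SUM(o.amount)
    FROM main.customers AS c
    JOIN sales.orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
print(rows)
```

The query only needs to know where each table lives and what its columns are, which is the role the metadata plays in a federated setup.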

Sampling is a statistical technique for selecting a representative subset of Big Data in which patterns can be found. Sampling makes it possible for data scientists to work efficiently with a manageable amount of data. Sampled data can be used for predictive analytics. The larger the sample, the more accurately it represents the data.
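One common way to sample from data too large to hold in memory is reservoir sampling, which keeps a fixed-size uniform sample while reading the data once. The sketch below is one possible technique, not one the text prescribes; the function name and the stream of integers are illustrative assumptions.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return k items chosen uniformly at random from a stream of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


# A made-up "large" stream; only k items are ever held in memory.
sample = reservoir_sample(range(1_000_000), k=5, seed=42)
print(sample)
```

The seed is fixed here only to make the run repeatable; leaving it as `None` gives a different sample each time.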

Big Data Analytics is the process of exploring large datasets to find hidden patterns and underlying relations that yield valuable customer insights and other useful information. It finds application in various areas such as finance and customer services. It is a good choice for Ph.D. research in big data.

Clustering is a technique for analyzing big data. In clustering, objects are grouped together according to their similarities and characteristics. In other words, this technique partitions the data into different sets. The partitioning can be hard, where each object belongs to exactly one cluster, or soft, where an object can belong to several clusters to different degrees. Various clustering algorithms have been designed for big data and data mining. It is a good area for thesis and research in big data.
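Hard partitioning can be illustrated with k-means, one widely used clustering algorithm. The sketch below is a minimal pure-Python version for 2-D points; the points and starting centroids are made up, and production work would normally rely on a library implementation.

```python
def kmeans(points, centroids, iterations=10):
    """Hard-partition 2-D points into len(centroids) clusters (Lloyd's algorithm)."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl))
            if cl
            else c  # keep the old centroid if its cluster is empty
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]  # made-up data
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(clusters)
```

Every point ends up in exactly one cluster, which is what makes this a hard partitioning; soft methods such as fuzzy c-means would instead give each point a degree of membership in every cluster.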

SQL-on-Hadoop is an approach to running SQL on the Hadoop platform by combining SQL-style querying with the newer components of the Hadoop framework. There are several ways to execute SQL in a Hadoop environment: connectors that translate SQL into MapReduce format, push-down systems that execute SQL inside Hadoop clusters, and systems that distribute the SQL work between MapReduce-HDFS clusters and raw HDFS clusters. It is a very good topic for thesis and research in Big Data.

It is a technique of extracting information from existing datasets in order to find patterns and estimate future trends. Predictive Analytics is the practical outcome of Big Data and Business Intelligence (BI). Predictive analytics models take both current and historical data into consideration to produce insights about the future. It is also an interesting topic for thesis and research in Big Data.
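As a small illustration of using historical data to estimate a future trend, the sketch below fits a straight line by ordinary least squares and extrapolates one step ahead. The `months` and `sales` figures are made-up example data, and a linear trend is just one simple model among many that predictive analytics can use.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: monthly sales for six months.
months = [1, 2, 3, 4, 5, 6]
sales = [100, 110, 120, 130, 140, 150]

slope, intercept = fit_line(months, sales)
forecast = slope * 7 + intercept  # estimate for month 7
print(f"trend: {slope:.1f} per month, forecast for month 7: {forecast:.1f}")
```

Real predictive models usually add more features and validate the forecast against held-out data, but the principle is the same: learn from current and historical records, then project forward.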

These were some of the good big data topics for M.Tech and master's thesis and research work. For help with M.Tech and Ph.D. thesis topics in Big Data, contact Techsparks on 91-9465330425.

BIG DATA MASTER THESIS

“Big data” can be broadly defined as data resources of high volume, variety, and velocity that call for cost-effective, creative forms of analysis to improve insight and visualization. In practice, the size that counts as big data keeps changing. A big data system must allow diverse types of input to be fully assimilated and evaluated to help us draw conclusions. This article gives you an overview of the Big Data master thesis, covering the aspects needed to do big data research and thesis work effectively. Let us first start with understanding the various processes in big data.

Big Data Processes

  • Data acquisition
  • Recording of information
  • Annotations
  • Data representation
  • Feature extraction
  • Data analytics
  • Interpretation

What are big data analysis techniques? 

  • Identifying data entities and redundancies
  • Processing the missing and abnormal values
  • Processing the skewness
  • Standardization and discretization of data
  • Constructing attributes
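Several of the steps listed above can be sketched with Python's standard library alone. The example below handles a missing value, reduces skew, standardizes, and discretizes a single column. The `raw` values, the choice of median imputation and a log transform, and the bin thresholds are all assumptions made for illustration; other choices are equally valid.

```python
import math
import statistics

# Hypothetical column with one missing value (None) and one extreme value.
raw = [12.0, None, 15.0, 14.0, 400.0, 13.0]

# Processing missing values: fill gaps with the median of the observed values.
observed = [v for v in raw if v is not None]
median = statistics.median(observed)
filled = [median if v is None else v for v in raw]

# Processing skewness: a log transform compresses the long right tail.
logged = [math.log1p(v) for v in filled]

# Standardization: rescale to zero mean and unit (population) standard deviation.
mean = statistics.fmean(logged)
stdev = statistics.pstdev(logged)
standardized = [(v - mean) / stdev for v in logged]

# Discretization: map each standardized value to a coarse category.
bins = [
    "low" if z < -0.5 else "high" if z > 0.5 else "medium"
    for z in standardized
]

print(filled)
print(bins)
```

The median is used for imputation here because, unlike the mean, it is not pulled upward by the extreme value of 400.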

More specific real-life applications of Big Data can be found on our website, which can help you better understand these processes. Our big data master thesis service is a sought-after research assistance service, used by students and researchers from renowned universities. Our up-to-date technical team delivers trustworthy and comprehensive research help in Big Data. Let us now discuss the recent research directions of big data.

What are the current directions of Big Data analytics?

  • Collecting, transforming, and analyzing big data with the support of data centers
  • Integrating information from multiple sources and distributing data for management and computation
  • Utilising machine learning and data mining for large-scale data analysis
  • Using data visualization tools together with machine learning techniques
  • Handling privacy and security concerns in big data using statistical theory and big data sampling processes

You can reach out to our experts for any of these areas of big data research. Big data is widely regarded as one of the most important technological advancements in today’s digital world. Contact us if you’re looking for a vast repository of research-related data drawn from real-time big data platforms. Let us now look into ongoing research areas in big data.

Our ongoing activities in big data

  • Basic theory study, analysis, and development
  • Working with advanced techniques, methods, and algorithms
  • Developing advanced techniques based on the latest technologies in order to enhance the efficiency of big data applications and solve many big data issues
  • Enabling research scholars and students from all over the world to interact with big data researchers and scientists to integrate new technologies to carve out innovations
  • Developing advanced technologies and methodologies to solve potential problems in Big Data analytics

Although Big Data is a broad topic that would take several books and programs to cover, our developers focus on the fundamentals so that students understand what to consider when digging deep into Big Data algorithms and strategies. Let us now look into the major demands of big data.

What are the requirements of big data models?

  • Novel applications, techniques, and advanced solutions for creating a positive impact in big data research
  • New big data model for real-time data analysis and processing with enhanced security features to ensure privacy and secrecy of data. 

Our team includes technical specialists who have worked on Big Data projects since their inception. Let us now discuss the significance of the research.

What is the purpose of a research project?

Every research work has its own significance, but not every one can be implemented in the real world. There are many assumptions and hypotheses that have yet to be worked out in practice, even as ideas once confined to science fiction are becoming reality. The following are the important areas in which privacy and security policies have to be given priority:

  • Agricultural, logistic and financial data
  • Sensor, web, and city data
  • Integration/fusion of data for decision making
  • Mining and visualization of data
  • Utilising big data analytics in real time, for instance in smart city management

We have experienced, qualified engineers and skilled writers with professional certification to provide you with full support in all of these areas. In a Big Data master thesis, we follow a systematic plan to maintain proper structure and consistency in the language of the scholarly work. All of your ideas, points of view, and references will be organized logically. Let’s now look at some real-time applications of big data analytics.

Real-time applications of Big Data analytics

  • Data obtained from location and GPS
  • Integrated personal information from satellite images
  • Call data records
  • Enables reliable tracking of location and proper recommendation of routes
  • Most useful in routing drones for applications in military, emergency situations and identifying infections
  • Location information
  • Determining the mobility pattern across the globe for containment of infectious diseases and planning transportation
  • Data on location and consumption pattern
  • Data from a smart meter, history of usage, and gas status
  • Promoting the use of green energy by increasing conservation
  • Establishing use efficiency by predicting energy consumption rate
  • Record of patients’ data and electronic health record
  • Health history data, X-rays, and images
  • Enhance the health monitoring purposes and used in studying patient’s immune response
  • Recommendation of activity for maintaining physical health and elderly people
  • Data on network signal strength and network user information
  • Geolocation and sensor data
  • Network log activities, video camera data, and weblogs
  • It is used in effective network signaling and network dynamics prediction
  • Management of networks and cell deployment data generation
  • Log and social media data
  • Product reviews, tweets, and blog posts
  • User service recommendations which are effective and efficient
  • Online surveys and questionnaires, ECG, EMG, and pulse rate
  • Data sensings like gyroscopes, accelerometer, and magnetometer
  • Utilising smartphones and other online network frameworks for collecting and analyzing data on a large scale
  • Selection and review of products 
  • Location and data buying behavioral analysis
  • Reviews on customer products and help in analyzing product’s strengths and weaknesses

There are many more real-time big data applications, each specific to particular requirements. Speak with one of our technical specialists about the practices we implemented to improve the effectiveness of our Big Data programs. Because we stick to a zero-plagiarism standard, our writers promise that there will be no duplication in the final edition of the thesis we prepare. We guarantee a thorough grammar check, internal review, and on-time submission. Let us now discuss the technologies used in big data analytics in further detail.

What are the technologies used in Big Data analytics? 

  • Data retrieval, mining, analytics, and distribution
  • Massive parallelism, machine learning, and AI
  • High-speed networking and high-performance computation
  • Hadoop, Spark-based big data analytics technologies 

For quantitative, analytical, theoretical, and coding support related to all these methodologies, you can approach us for big data master thesis writing. Our professionals can explain everything about Big Data and answer all of your questions. Let’s now get into the different types of Big Data tools.

Best Big Data Management Tools

NoSQL provides non-relational databases for storing and managing data that is both unstructured and structured. It does not need normalization or application porting integration. Elastic scaling distributes big data across different hosts, which reduces computational overhead. The following are the important NoSQL-based tools for managing big data storage systems.

  • It is a highly reliable system for storing large volumes of data with fault tolerance
  • It is used in reading the data once and interpreting it for writing many times by consuming minimal storage

For the pros and cons of these tools, you can get in touch with us at any time. The following are the major tools in managing the big database

  • It is one of the important Hadoop tools for enabling machine learning and real-time data processing
  • The tool is significantly used in operations of reading and writing, batch processing, joining streams, node failure handling, and many more
  • Spark’s built-in APIs can be used from many common programming languages
  • Summarising, analysis of queries and data with SQL interface is one of the biggest advantages provided by Apache hive
  • It facilitates and helps in maintaining the writing with the use of approaches like indexing
  • It provides column-based data storage
  • It stores large datasets on top of HDFS
  • It provides for aggregating and analyzing datasets with multiple rows in a very less period
  • Analysis of large generator datasets is made easier 
  • It provides increased performance, throughput and the response time is also quick
  • It is an RDBMS data import and export tool
  • Time for processing data is reduced by providing a mechanism for computational offloading

Once you reach out to us, we will provide you with a huge amount of standard and very precise research data regarding the use of these tools. Let us now look into some of the important tools that are used in big data processing mechanisms, 

  • Flume is used for moving data into and out of Hadoop
  • HDFS data streaming by easy to use and flexible framework leading to efficient aggregation
  • It is an important tool used in handling streaming functions and batches
  • It is a highly efficient real-time analysis tool used in Hadoop based distributed stream processing
  • By using distributed snapshots this tool provides increased performance in data operation by enabling fault tolerance
  • It also provides an integrated runtime environment for batch processing and data streaming applications
  • Hadoop cluster job parallelization tool that works by enabling coordination and workflow
  • Multiple job execution with fault tolerance is allowed by this tool
  • It is also used in seamless job control in web service APIs
  • It is an important tool used in job management computation and scheduling of resources
  • It is a programming framework based on Hadoop used in batch processing
  • It can store a huge volume of distributed data in a cost-effective manner and so its scalability is also very high
  • It is a tool that provides a proper Framework for processing data which is used in defining the workflow
  • It also gives execution steps using a proper acyclic graphical representation
  • Its interface is very simple and can be used in very fast data processing applications
  • Switching from the MapReduce platform is also enabled by this tool
  • It is one of the important large data processing tools used in clustering, classifying, regression, collaborative filtration, segmenting, and statistical modeling applications
  • It is useful in complementing applications that involve the use of distributed data mining
  • This tool is used in Hadoop based allocation of resources and scheduling jobs
  • Hadoop 2.0 mechanism forms the basis of this tool which is used in managing resources and metadata maintenance while at the same time tracking user data
  • Efficient resource utilization by adding YARN into Hadoop and higher data availability is provided by this tool

Any Big Data system’s success is largely determined by its tools and algorithms. Algorithms are used to regulate, find, and build the cognitive models of a Big Data system. One of the most significant functions of Big Data algorithms is to extract valuable information and analyze it to arrive at results. For this reason, in order to write the best code for your big data projects, you’ll need to expand your skills in the major programming languages. Let’s have a look at some of the most essential big data programming languages.

Top 3 programming languages for Big Data analytics

  • It is a general-purpose programming language that consists of a large number of open-source packages used for the following purposes
  • Data modeling, pre-processing, mining, and computation
  • Machine learning, analysis of network graphs, and processing natural languages
  • It is a highly user-friendly and object-oriented programming language that is well known for its flexible and supportive aspects that allows it to integrate with various other platforms for big data processing like Apache spark
  • It is one of the common open-source programming languages used in data analysis and visualization
  • It is also highly significant in handling complicated data as it provides for efficient storage systems and performing vector operations
  • It is useful in performing all the following popular data related functions in a more efficient manner
  • Reading and writing data into the memory
  • Data cleansing, storage, visualization, mining, and machine learning
  • It is one of the important tools in carrying out big data analytics and processing
  • Apache Spark provides for complicated app development platform using multiple programming languages with Java enabled virtual machine-based data processing
  • It is used in scala supported big data processing, analysis, and management
  • It enables simple, fast applications built on inherently immutable data, which reduces the thread-safety problems found in similar languages

You can get full support on all these tools and programming languages from us. Our professionals give priority to all of the vital parts of these Big Data research fields so that customers can carry out their research smoothly. Our writers are likewise careful about following your institution’s formatting rules and norms, so you can use our services with confidence.

We are helping individuals to carve out customized big data systems for their enhancement. We have got qualified teams of research experts, writers, developers, engineers, and technical teams to assist you in all aspects of your big data master thesis. We will look into the important stages in master thesis writing

Main Stages of writing a master’s thesis

Writing a strong thesis is one of the most important ways to showcase your field knowledge, talent, and innovation, which in turn attracts readers. Our expert writers provide the necessary resources and support to our customers for writing a thesis on any big data master thesis topic. In the following, you can find some important aspects of a master thesis.

  • Choose one of the most interesting and recent topics
  • Try to create a holistic proposal
  • Utilise all the relevant resources to carry out the research
  • Give utmost importance to proofreading, checking, and formatting
  • Have brief talks and detailed discussions with your guide and mentor regarding the content

As a result, you may want the assistance of professionals in the subject in order to begin your Big Data Master Thesis. We have links with experts from leading firms, institutes, and academia, so we are well-versed in the technical aspects of contemporary Big Data research. Hence you can have all your research needs met in one place. Let us now talk about some important thesis topics in big data.

Top 6 Big Data Master Thesis Topics

  • Data retrieval based on queries
  • Social network sentiment analysis both offline and online
  • Correlated big data analysis for protecting the privacy
  • Preserving the privacy and ensuring the security of big data users
  • Big spatial data similarity search
  • Allocation of resources in Big Data System with elevated security awareness

These are some of the most popular and current study areas in the field of big data. For any type of research support, including PhD proposals, dissertation writing help, paper publishing, assignments, literature reviews, and big data master theses, feel free to contact our developers. We are happy to help you.

Analytics Insight

Compelling Thesis Topics in the Field of Data Science 2024

Dynamic Thesis Topics Propelling Data Science into 2024’s Technological Frontier

As the realm of data science continues to evolve, students seeking to make their mark in this dynamic field are confronted with the challenge of selecting thesis topics that are not only relevant but also hold the promise of contributing significantly to the discipline. In 2024, the landscape of data science is marked by a fusion of emerging technologies, ethical considerations, and real-world applications. In this article, we explore ten compelling thesis topics that encapsulate the essence of contemporary data science.

Deep Learning: Unraveling the Depths of Neural Networks:

Deep learning remains at the forefront of data science, driving advancements in image recognition, natural language processing, and more. A thesis in this domain could delve into optimizing deep learning architectures, exploring transfer learning applications, or investigating the interpretability of complex neural networks.

Exploratory Data Analysis (EDA): Navigating the Data Wilderness:

EDA is the compass that guides data scientists through uncharted territories. A thesis on exploratory data analysis could focus on developing innovative EDA techniques, integrating visualizations for deeper insights, or applying EDA methodologies to specific industries such as healthcare or finance.

Fake News Detection: The Battle Against Information Manipulation:

In an era dominated by information, combating fake news is paramount. A thesis in fake news detection could explore novel machine learning algorithms, examine the role of social media in spreading misinformation, or propose frameworks for automated verification and fact-checking.

Chatbot Revolution: Bridging the Human-Machine Communication Gap:

Chatbots have become ubiquitous, transforming customer service and user engagement. A thesis on chatbots could investigate natural language processing algorithms, assess user experience in chatbot interactions, or explore ethical considerations in the deployment of conversational agents.

Credit Card Fraud Detection: Safeguarding Financial Transactions:

As digital transactions surge, the need for robust fraud detection systems intensifies. A thesis in credit card fraud detection could explore anomaly detection methods, leverage machine learning for real-time monitoring, or investigate the impact of imbalanced datasets on fraud prediction models.
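To give a feel for anomaly detection on transaction amounts and for why imbalanced data complicates evaluation, here is a minimal sketch. It flags outliers with a modified z-score based on the median absolute deviation (MAD), which is one simple statistical approach rather than a production fraud model. The `amounts`, the `true_fraud` labels, and the 3.5 threshold are illustrative assumptions.

```python
import statistics


def flag_anomalies(amounts, threshold=3.5):
    """Flag values whose MAD-based modified z-score exceeds the threshold."""
    median = statistics.median(amounts)
    deviations = [abs(a - median) for a in amounts]
    mad = statistics.median(deviations)
    if mad == 0:
        # All values are (nearly) identical; nothing stands out.
        return [False] * len(amounts)
    return [0.6745 * d / mad > threshold for d in deviations]


# Hypothetical card transactions; the sixth one is the fraudulent charge.
amounts = [25.0, 30.0, 27.5, 22.0, 31.0, 2500.0, 26.0, 29.0]
true_fraud = [False, False, False, False, False, True, False, False]

flags = flag_anomalies(amounts)

# A model that never predicts fraud still looks accurate on imbalanced data.
baseline_accuracy = sum(not t for t in true_fraud) / len(true_fraud)

# Recall shows what accuracy hides: the share of real fraud actually caught.
caught = sum(f and t for f, t in zip(flags, true_fraud))
recall = caught / sum(true_fraud)

print(flags)
print(f"never-fraud accuracy: {baseline_accuracy:.3f}, detector recall: {recall:.1f}")
```

Because genuine transactions vastly outnumber fraudulent ones, metrics such as recall and precision are usually more informative than plain accuracy when comparing fraud prediction models.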

Data Visualization: Painting Insights with Data:

Data visualization is the art of storytelling in the data science realm. A thesis on data visualization could delve into the design principles for effective visualizations, explore the impact of storytelling in conveying data insights, or assess the accessibility of visualizations for diverse audiences.

Natural Language Processing (NLP): Decoding the Language of Machines:

Natural Language Processing (NLP) constitutes the core of language-centric applications, ranging from sentiment analysis to language translation. A thesis in NLP could explore advanced language models, sentiment analysis techniques, or the ethical implications of language processing in applications like virtual assistants.

Quantum Computing for Big Data Analytics: Bridging Classical and Quantum Realms:

The integration of quantum computing and big data analytics presents transformative potential with profound implications for various industries. A thesis in this domain could explore quantum algorithms for data analysis, assess the scalability of quantum computing in handling massive datasets, or investigate hybrid models that leverage both classical and quantum computing resources.

Scalable Architectures for Parallel Data Processing: Navigating the Data Deluge:

As data volumes grow exponentially, scalable architectures are essential for efficient data processing. A thesis in scalable architectures could explore distributed computing frameworks, assess the performance of parallel processing in handling diverse data types, or propose innovative solutions for real-time data processing.

Sentiment Analysis: Deciphering Emotions in the Digital Era:

Understanding public sentiment is vital in various domains, from marketing to politics. A thesis in sentiment analysis could delve into advanced sentiment classification models, explore cross-cultural sentiment variations, or investigate the impact of sentiment analysis on decision-making processes.

Conclusion:

The field of data science in 2024 is characterized by a convergence of cutting-edge technologies and the imperative to address real-world challenges. The ten compelling thesis topics outlined above offer students the opportunity to embark on a journey of exploration and innovation. Whether unravelling the intricacies of deep learning, combating misinformation, or navigating the vast landscape of data visualization, each topic represents a gateway to making a meaningful contribution to the ever-evolving field of data science. As students embark on their thesis endeavors, these topics provide a roadmap to the pinnacle of data science in 2024.

PhD Thesis Blog

5 Trending PhD Research Topics in Big Data

In the past decade, Big Data has emerged as a powerful technology tool and is growing in leaps and bounds. There are a number of industry sectors in which PhD research is being conducted for Big Data, including Ecommerce, banking, insurance, telecom, and the health sector.

There are a number of quality research programs being pursued by PhD scholars on the vast and growing field of Big Data. While the maximum number of Big Data research papers is in the field of computer science (171), other academic fields for this line of research include Engineering (75), Mathematics (33), and Business Management (26).

Listed below are the 5 trending research topics being pursued by PhD scholars around the globe:

  • Big Data analytics

Big Data analytics has emerged as a powerful tool for harnessing big data for industry-specific uses. A number of e-commerce retailers use analytics for online sales conversion and for determining customer behaviour. Another potential use is improving the performance of athletes.

  • Improving the quality of healthcare

Currently, research is being conducted in the areas of drug discovery, drug response, bioinformatics, clinical data analysis, and public health data. According to the 2011 McKinsey report on Big Data, Big Data has the potential to reduce US national health care costs by around 8%.

  • Data visualization

Big Data users are able to see and analyse big data sets using much improved visualization tools. The advent of touch-sensitive navigation has brought huge improvements in interactive visualization technology.

  • Hadoop framework

Research on Apache Hadoop framework is aimed at developing software applications that can be deployed on a larger and distributed network. Deployed across network clusters, the Hadoop framework has been used by a host of popular web platforms including Twitter, LinkedIn, Amazon, and Facebook. Other research topics include the MapReduce programming model, used for executing code for processing large amounts of data over distributed network clusters.
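The MapReduce programming model can be illustrated without a cluster. The sketch below runs the three conceptual phases (map, shuffle, reduce) of a word count in a single Python process. It is only a model of the idea, not Hadoop's actual API: the function names and the sample `documents` are assumptions, and a real framework would run the map and reduce steps in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


# Hypothetical input split into two records.
documents = ["big data needs big clusters", "Hadoop stores big data"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Because each map call looks at only one record and each reduce call at only one key, both phases can be spread across many machines, which is what lets the model scale to large datasets.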

  • Distributed Storage systems

Other areas of PhD research include efficient ways of storing large volumes of data over large-scale distributed network clusters. Examples include the Google File System, used for storing data-intensive applications over distributed systems, and Bigtable, used for structured storage of Big Data.

The constant evolution of Big Data presents researchers with dynamic challenges, while also presenting them with opportunities of determining the evolution of science.

50 Best Thesis Topics for Big Data in Urban Planning

  • May 8, 2023
  • Transportation Planning , Urban design thesis , Urban Planning , urban research

What is Big data in urban planning?

The term “big data” in urban planning refers to the use of expansive, intricate databases to recognise and effectively handle urban issues. This information can be gathered from a variety of sources, including sensors, social media, and public records. It is frequently analysed using cutting-edge methods like machine learning and data mining. By spotting patterns and trends that would be challenging to spot using conventional techniques, big data in urban planning aims to build more effective, sustainable, and livable cities. This strategy has the potential to revolutionise urban planning by giving decision-makers access to a plethora of data that may help them make decisions about, among other things, public services, housing, and transportation.

Postgraduate program in Big data in urban planning

Urban planning is a rapidly expanding area, and students who complete a master’s degree programme focused on big data will be well-equipped to take advantage of this trend. Solutions to problems in urban planning, such as traffic congestion, air pollution, and sprawl, are a primary focus of the programme. Modern methods of data analysis and interpretation are taught, and the students then apply their knowledge to the problem of making cities better places to live. The curriculum prepares students for employment in urban planning, data analysis, and policy making.

Big Data in Urban Planning Thesis Topics List:

  • Analyzing the impact of big data on urban planning processes
  • Using big data to identify patterns in urban transportation
  • Integrating big data into real-time urban traffic management systems
  • Investigating the effectiveness of big data in urban disaster management
  • Exploring the use of big data in urban design decision-making
  • Evaluating the potential of big data in optimizing urban energy consumption
  • Developing big data-based predictive models for urban development scenarios
  • Examining the ethical considerations of using big data in urban planning
  • Investigating the use of big data in identifying and mitigating urban environmental risks
  • Analyzing the role of big data in urban mobility planning

  • Developing big data-based approaches to urban air quality monitoring and management
  • Exploring the potential of big data in urban water management
  • Evaluating the use of big data in enhancing urban public safety
  • Analyzing the impact of big data on urban land-use planning
  • Investigating the role of big data in urban economic development
  • Developing big data-based approaches to urban infrastructure planning
  • Exploring the use of big data in urban housing policy and planning
  • Analyzing the potential of big data in promoting sustainable urban development
  • Investigating the use of big data in urban health policy and planning
  • Developing big data-based approaches to urban waste management
  • Evaluating the role of big data in urban green space planning and management
  • Analyzing the impact of big data on urban cultural heritage preservation
  • Investigating the use of big data in urban tourism planning
  • Developing big data-based approaches to urban noise pollution management
  • Exploring the use of big data in urban food security planning
  • Analyzing the potential of big data in urban resilience planning
  • Investigating the use of big data in urban education policy and planning
  • Developing big data-based approaches to urban climate change mitigation and adaptation
  • Evaluating the role of big data in urban zoning and land-use regulations
  • Analyzing the impact of big data on urban water supply planning

  • Investigating the use of big data in urban crime prevention and law enforcement
  • Investigating the use of big data in urban street network planning and design
  • Developing big data-based approaches to urban biodiversity conservation
  • Evaluating the use of big data in urban transportation demand management
  • Analyzing the impact of big data on urban public participation in decision-making
  • Investigating the role of big data in urban noise pollution management
  • Developing big data-based approaches to urban infrastructure asset management
  • Exploring the use of big data in urban social inequality analysis and policy-making
  • Analyzing the potential of big data in urban public service provision
  • Developing big data-based approaches to urban accessibility planning
  • Exploring the use of big data in urban accessibility and mobility for people with disabilities
  • Exploring the use of big data in urban water resource management
  • Analyzing the potential of big data in urban traffic safety planning and management
  • Investigating the use of big data in urban neighborhood revitalization
  • Developing big data-based approaches to urban pedestrian and bicycle infrastructure planning
  • Evaluating the use of big data in urban job creation and economic growth
  • Analyzing the impact of big data on urban public space design and management
  • Investigating the role of big data in urban social network analysis and visualization
  • Developing big data-based approaches to urban historic preservation
  • Analyzing the potential of big data in urban disaster risk reduction

Urban Design Lab

About the author.

This is the admin account of Urban Design Lab. This account publishes articles written by team members, contributions from guest writers, and other occasional submissions. Please feel free to contact us if you have any questions or comments.

© 2019 UDL Education Pvt. Ltd. All Rights Reserved.

LIBRARIES | ARCH

Data science masters theses.

The Master of Science in Data Science program requires the successful completion of 12 courses to obtain a degree. These requirements cover six core courses, a leadership or project management course, two required courses corresponding to a declared specialization, two electives, and a capstone project or thesis. This collection contains a selection of master's theses and capstone projects by MSDS graduates.

Writing a thesis is the final step in obtaining a Bachelor or Master degree. A thesis is always coupled to a scientific project in some field of expertise. Candidates who want to write their thesis in the Big Data Analytics group should, therefore, be interested and trained in a field related to our research areas.

A thesis is an independent, scientific and practical work. This means that the thesis and its related project are conducted exclusively by the candidate; the execution follows proper scientific practices; and all necessary artifacts, algorithms and evaluations have been physically implemented and submitted as part of the thesis. A proper way of sharing code and evaluation artifacts is the creation of a public GitHub repository, which can, then, be referenced in the thesis. The thesis serves as a documentation for the project and as scientific analysis and reflection of the gathered insights.

For students interested in a thesis, we offer interesting topics and close, continuous supervision during the entire thesis time. Every thesis is supervised by at least one member of our team, who can give advice and help in critical situations. The condensed results of our best master theses have been published at top scientific venues, such as VLDB, CIKM, EDBT, etc.

A selection of open thesis topics can be found on this page. We also encourage interested students to suggest their own ideas in the context of our research areas and to contact individual members of the group directly. An ideal thesis topic is connected in some form to the research projects of a group member. That group member will then become a supervisor for the thesis. Hence, taking a look at the personal pages and our current projects is a good starting point for a thesis project. Recent publications at conferences, such as VLDB or SIGMOD, or open research challenges on, for example, Kaggle are good resources for finding interesting thesis ideas.

Organizational information

  • Exposé: Before starting a thesis, Master students have to write an exposé of 2-5 pages. The exposé is a description of the planned project and includes a motivation for the topic, a literature review on related work, a draft of the research/project idea, and a plan for the final evaluation. Please consider our template with initial instructions when starting your exposé. The exposé can be created in the context of the "Selbstständiges wissenschaftliches Arbeiten" (independent scientific work) module.
  • Timetable: Once the thesis project is started, it must be finished within six months for Master and four months for Bachelor theses. Only special events, such as times of sickness, can extend this period. If you are working a regular job or need to take further courses during your thesis time, the thesis time can be extended as well. A thesis can be started at any time, either aligned with the semester schedule or independent of it.
  • Presentations: The work on a Master thesis requires students to give at least two talks. A mid-term talk serves to get some additional feedback from a larger audience and to practice the final thesis defense; this talk is not graded. The final talk is a proper defense of the thesis and the final results; this talk is graded as one part of the academic performance.

Hints for the thesis

  • Length: A typical thesis is 30-60 pages (Bachelor) or 40-90 pages (Master) long.
  • Language: A thesis can be written in German or English. We recommend English, though.
  • Format: We highly recommend writing a thesis in LaTeX, as in this way many structural defects can easily be avoided.
  • Tips for writing a thesis
  • Tips for writing a paper (short)
  • Tips for writing a paper (long)

Bachelor and Master Theses

  • We aim to translate the batch processing-based Sindy algorithms for the discovery of inclusion dependencies with Akka into a reactive, more efficient data profiling approach.
  • We aim to translate the Many algorithm for inclusion dependency discovery on Web Tables into a partializing IND discovery algorithm that is better suited for data integration scenarios.
  • The data profiling language DPQL is a recently developed metadata profiling interface that serves the discovery of complex metadata patterns.
  • We aim to develop efficient profiling approaches that find these metadata patterns as fast as possible.
  • IoT applications, multi-sensor systems and many distributed software systems record time series in different frequencies, temporal alignments, speeds, and formats, which makes their integrated analysis a technically and algorithmically challenging task. We therefore aim to develop a time series engineering library that assists the integration and preparation of time series for analytical tasks, such as anomaly detection, forecasting, clustering etc.
  • As part of the project, we could generate and measure our own time series with different sensors and afterwards aggregate the measurements with the time series library into a single multivariate time series.
  • Based on the movement events of agents in cities, we aim to plan the placement of info-stations, such that these stations inform as many nearby agents as possible in some fixed time period.
  • The project will be conducted in collaboration with the emergenCity project.
  • We will use the streams of movement data and the Lambda engine that is currently in development at the UMR.
  • Keywords: Lambda queries, lattice search
  • Given non-invasive medical sensor measurements, such as heartbeats or temperature curves, we aim to find anomalous recordings that may indicate diseases or body malfunctions via modern anomaly detection, clustering and/or prediction techniques for time series.
  • The project will be conducted in collaboration with the VirtualDoc project.
  • Keywords: time series analytics, machine learning
  • In this project, we aim to slice time series into semantically meaningful subsequences. In contrast to traditional sliding or hopping windows, semantic windows should capture variable-length concepts, such as heartbeats in ECG data. These subsequences will then help anomaly detection or clustering algorithms produce better results.
  • Discovering anomalies in streaming data is a challenging task; hence, we aim to translate batch anomaly detection algorithm(s) into the streaming scenario.
  • Our goal is to discover anomalies as fast as possible while sacrificing as little precision as possible.
  • Keywords: stream processing
  • In film scoring, certain visual scenes are accompanied by appropriate sounds; we plan to automate this process with artificial intelligence.
  • Given a database with already scored films, we first extract the scene-to-sound mappings and, then, train a model to learn the scoring process.
  • The project will be conducted in collaboration with a professional film scorer.
  • Keywords: image processing, machine learning
  • First-Line schema matching produces similarity matrices which indicate how likely two attributes of different schemata represent the same semantic concept.
  • Second-Line schema matching consumes similarity matrices and aims to produce improved similarity matrices.
  • There are two main approaches for second-line matching: 1) similarity matrix boosting and 2) ensemble matching. While the former tries to transform a given similarity matrix into a more valuable one, the latter consumes multiple matrices and combines them to a single new similarity matrix.
  • We aim to improve the Hungarian Method by increasing its efficiency in exchange for a bit of fuzziness/approximation (= reduced correctness).
  • Also interesting: Can we allow (to some extent) 1:n and n:m mappings in the attribute matching?
  • Knowledge bases are a valuable source of publicly available data and data integration scenarios. To make these scenarios usable for relational data integration systems as well, this project aims to develop a shredding algorithm that translates linked open data into meaningful relational tables for data integration purposes.
  • Data integration test scenarios are very rare, especially if these scenarios should offer special properties, such as join- and unionable tables, unary and complex attributes matches, a broad selection of data types, schema-based and schema-less data, real-world data values and many other properties. This project, therefore, aims to develop a relation decomposer that takes existing, integrated datasets as input and automatically generates different integration scenarios with specific properties from these seed datasets via relational decomposition.
  • The Web Data Commons Crawl is a large dataset of relational tables that stem from crawled HTML Web tables. These tables often store data about the same or similar concepts, but because of how they were crawled they are completely unconnected. Hence, we aim to integrate the WDC corpus in a way that is as meaningful and correct as possible, which is both a technically and conceptually challenging task.
  • Data in data lakes is subject to constant changes. Data lakes, thereby, lack most of the control mechanisms that traditional database systems would use to, for example, standardise schemata, maintain indexes or enforce constraints. In this project, we aim to develop a system named lakehouse that dynamically integrates certain parts of a Data Lake to serve certain user-defined queries.
  • The federated learning technique DataGossip proposes to exchange not only model weights, but also some training data items for better convergence on skewed data distributions; we aim to improve this technique with more intelligent training data selection techniques.
  • Keywords: federated learning, distributed computing
  • The BYTE Challenge is a digital learning platform for computer science that targets children from grade 3 to 13.
  • In this project, we aim to assist the platform development and the assessment and curation of digital learning material, which includes videos, quizzes, papers etc.
  • Efficient Partial Inclusion Dependency Discovery
  • Entwicklung einer Chat-KI für Data Engineering [Development of a Chat AI for Data Engineering]
  • Image2Surface: Predicting Surface Properties of Workpieces from Laserscan Images
  • Image2Surface: Data Engineering for Visual Analytics
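  • Erkennung anomaler medizinischer Muster – Analyse nicht invasiver medizinischer Daten mittels maschinellen Lernens [Detecting Anomalous Medical Patterns: Analysis of Non-Invasive Medical Data Using Machine Learning] (2024)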
  • Data Generation and Machine Learning in the Context of Optimizing a Twin Wire Arc Spray Process (2023)
  • A Clustering Approach to Column Type Annotation: Effects of Pre-Clustering (2023)
  • Holistische Integration von WebDaten [Holistic Integration of Web Data] (2023)
  • User-Centric Explainable Deep Reinforcement Learning for Decision Support Systems (2023)
  • Combining Time Series Anomaly Detection Algorithms (2023)
  • DPQLEngine: Processing the Data Profiling Query Language (2023)
  • Aggregating Machine Learning Models for the Energy Consumption Forecast of Heat Generators (2023)
  • Correlation Anomaly Detection in High-Dimensional Time Series (2023)
  • HYPAAD: Hyper Parameter Optimization in Anomaly Detection (2022)
  • Time Series Anomaly Detection: An Aircraft Turbine Case Study (2022)
  • Distributed Duplicate Detection on Streaming Data (2021)
  • UltraMine - Scalable Analytics on Time Series Data (2021)
  • Distributed Graph Based Approximate Nearest Neighbor Search (2020)
  • A2DB: A Reactive Database for Theta-Joins (2020)
  • Distributed Detection of Sequential Anomalies in Time Related Sequences (2020)
  • Efficient Distributed Discovery of Bidirectional Order Dependencies (2020)
  • Distributed Unique Column Combination Discovery (2019)
  • Reactive Inclusion Dependency Discovery (2019)
  • Inclusion Dependency Discovery on Streaming Data (2019)
  • Generating Data for Functional Dependency Profiling (2018)
  • Efficient Detection of Genuine Approximate Functional Dependencies (2018)
  • Efficient Discovery of Matching Dependencies (2017)
  • Discovering Interesting Conditional Functional Dependencies (2017)
  • Multivalued Dependency Detection (2016)
  • DataRefinery - Scalable Offer Processing with Apache Spark (2016)
  • Spinning a Web of Tables through Inclusion Dependencies (2014)
  • Discovery of Conditional Unique Column Combination (2014)
  • Discovering Matching Dependencies (2013)
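The schema matching topic in the list above trades exactness of a 1:1 attribute assignment for speed. The sketch below illustrates that trade-off on a tiny, square similarity matrix. It is only an illustration under stated assumptions: `exact_matching` uses brute force as a stand-in for an exact solver such as the Hungarian Method, and `greedy_matching` is one simple approximation chosen for this example, not the approach the project proposes. The matrix values are made up.

```python
from itertools import permutations


def exact_matching(sim):
    """Exhaustively find the 1:1 assignment with maximal total similarity.

    Factorial cost, so only feasible for tiny matrices; it stands in for an
    exact solver such as the Hungarian Method.
    """
    n = len(sim)
    best = max(permutations(range(n)),
               key=lambda p: sum(sim[i][p[i]] for i in range(n)))
    return list(enumerate(best))


def greedy_matching(sim):
    """Approximate matching: repeatedly take the highest remaining similarity."""
    n = len(sim)
    cells = sorted(((sim[i][j], i, j) for i in range(n) for j in range(n)),
                   reverse=True)
    used_rows, used_cols, pairs = set(), set(), []
    for _, i, j in cells:
        if i not in used_rows and j not in used_cols:
            pairs.append((i, j))
            used_rows.add(i)
            used_cols.add(j)
    return sorted(pairs)


def total(sim, pairs):
    """Sum of similarities over the chosen attribute pairs."""
    return sum(sim[i][j] for i, j in pairs)


# Hypothetical first-line similarity matrix: rows are attributes of schema A,
# columns are attributes of schema B.
sim = [
    [0.9, 0.8, 0.1],
    [0.8, 0.1, 0.1],
    [0.1, 0.1, 0.7],
]
exact = exact_matching(sim)
greedy = greedy_matching(sim)
print(exact, total(sim, exact))
print(greedy, total(sim, greedy))
```

On this matrix the greedy heuristic commits early to the single best cell and ends up with a lower total similarity than the exact assignment, which is the kind of reduced correctness the topic description refers to.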

Technical University of Munich

  • Data Analytics and Machine Learning Group
  • TUM School of Computation, Information and Technology

Open Topics

We offer multiple Bachelor/Master theses, Guided Research projects and IDPs in the area of data mining/machine learning. A non-exhaustive list of open topics follows.

If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential topics.

Graph Neural Networks for Spatial Transcriptomics

Type:  Master's Thesis

Prerequisites:

  • Strong machine learning knowledge
  • Proficiency with Python and deep learning frameworks (PyTorch, TensorFlow, JAX)
  • Knowledge of graph neural networks (e.g., GCN, MPNN)
  • Optional: Knowledge of bioinformatics and genomics

Description:

Spatial transcriptomics is a cutting-edge field at the intersection of genomics and spatial analysis, aiming to understand gene expression patterns within the context of tissue architecture. Our project focuses on leveraging graph neural networks (GNNs) to unlock the full potential of spatial transcriptomic data. Unlike traditional methods, GNNs can effectively capture the intricate spatial relationships between cells, enabling more accurate modeling and interpretation of gene expression dynamics across tissues. We seek motivated students to explore novel GNN architectures tailored for spatial transcriptomics, with a particular emphasis on addressing challenges such as spatial heterogeneity, cell-cell interactions, and spatially varying gene expression patterns.

Contact: Filippo Guerranti, Alessandro Palma

References:

  • Cell clustering for spatial transcriptomics data with graph neural network
  • Unsupervised spatially embedded deep representation of spatial transcriptomics
  • SpaGCN: Integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network
  • DeepST: identifying spatial domains in spatial transcriptomics by deep learning
  • Deciphering spatial domains from spatially resolved transcriptomics with an adaptive graph attention auto-encoder

  • GCNG: graph convolutional networks for inferring gene interaction from spatial transcriptomics data

Robustness of Large Language Models

Type: Master's Thesis

  • Strong knowledge in machine learning
  • Very good coding skills
  • Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
  • Knowledge about NLP and LLMs

The success of Large Language Models (LLMs) has precipitated their deployment across a diverse range of applications. With the integration of plugins enhancing their capabilities, it becomes imperative to ensure that the governing rules of these LLMs are foolproof and immune to circumvention. Recent studies have exposed significant vulnerabilities inherent to these models, underlining an urgent need for more rigorous research to fortify their resilience and reliability. A focus in this work will be the understanding of the working mechanisms of these attacks.

We are currently seeking students for the upcoming Summer Semester of 2024, so we welcome prompt applications. This project is in collaboration with  Google Research .

Contact: Tom Wollschläger

  • Universal and Transferable Adversarial Attacks on Aligned Language Models
  • Attacking Large Language Models with Projected Gradient Descent
  • Representation Engineering: A Top-Down Approach to AI Transparency
  • Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

Generative Models for Drug Discovery

Type: Master's Thesis / Guided Research

  • Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
  • Knowledge of graph neural networks (e.g. GCN, MPNN)
  • No formal education in chemistry, physics or biology needed!

Effectively designing molecular geometries is essential to advancing pharmaceutical innovations, a domain which has experienced great attention through the success of generative models. These models promise a more efficient exploration of the vast chemical space and generation of novel compounds with specific properties by leveraging their learned representations, potentially leading to the discovery of molecules with unique properties that would otherwise go undiscovered. Our topics lie at the intersection of generative models like diffusion/flow matching models and graph representation learning, e.g., graph neural networks. The focus of our projects can be model development with an emphasis on downstream tasks ( e.g., diffusion guidance at inference time ) and a better understanding of the limitations of existing models.

Contact: Johanna Sommer, Leon Hetzel

  • Equivariant Diffusion for Molecule Generation in 3D
  • Equivariant Flow Matching with Hybrid Probability Transport for 3D Molecule Generation
  • Structure-based Drug Design with Equivariant Diffusion Models

Efficient Machine Learning: Pruning, Quantization, Distillation, and More - DAML x Pruna AI

Type: Master's Thesis / Guided Research / Hiwi

The efficiency of machine learning algorithms is commonly evaluated by looking at target performance, speed and memory footprint metrics. Reducing the costs associated with these metrics is of primary importance for real-world applications with limited resources (e.g. embedded systems, real-time predictions). In this project, you will work in collaboration with the DAML research group and the Pruna AI startup on investigating solutions to improve the efficiency of machine learning models by looking at multiple techniques like pruning, quantization, distillation, and more.
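As a rough illustration of two of the techniques named above, the following sketch applies magnitude pruning and symmetric 8-bit quantization to a flat list of weights. It is a toy example in plain Python, not the method used by the group or by Pruna AI; the function names and the weight values are assumptions made for this example.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    # Indices of the k weights with the smallest absolute value.
    drop = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric uniform quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q_weights, scale):
    """Map the integer codes back to approximate floating-point weights."""
    return [q * scale for q in q_weights]


weights = [0.50, -0.02, 0.31, 0.01, -1.27, 0.07]
pruned = magnitude_prune(weights, sparsity=0.5)
q_weights, scale = quantize_int8(pruned)
restored = dequantize(q_weights, scale)
print(pruned)
print(q_weights)
```

Pruning reduces the number of non-zero parameters, while quantization reduces the bits stored per parameter; the rounding error per weight is bounded by half the quantization step `scale`.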

Contact: Bertrand Charpentier

  • The Efficiency Misnomer
  • A Gradient Flow Framework for Analyzing Network Pruning
  • Distilling the Knowledge in a Neural Network
  • A Survey of Quantization Methods for Efficient Neural Network Inference

Deep Generative Models

Type:  Master Thesis / Guided Research

  • Strong machine learning and probability theory knowledge
  • Knowledge of generative models and their basics (e.g., Normalizing Flows, Diffusion Models, VAE)
  • Optional: Neural ODEs/SDEs, Optimal Transport, Measure Theory

With recent advances, such as Diffusion Models, Transformers, Normalizing Flows, Flow Matching, etc., the field of generative models has gained significant attention in the machine learning and artificial intelligence research community. However, many problems and questions remain open, and the application to complex data domains such as graphs, time series, point processes, and sets is often non-trivial. We are interested in supervising motivated students to explore and extend the capabilities of state-of-the-art generative models for various data domains.

Contact: Marcel Kollovieh, David Lüdke

  • Flow Matching for Generative Modeling
  • Auto-Encoding Variational Bayes
  • Denoising Diffusion Probabilistic Models 
  • Structured Denoising Diffusion Models in Discrete State-Spaces

A Machine Learning Perspective on Corner Cases in Autonomous Driving Perception  

Type: Master's Thesis 

Industrial partner: BMW 

Prerequisites: 

  • Strong knowledge in machine learning 
  • Knowledge of Semantic Segmentation  
  • Good programming skills 
  • Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch) 

Description: 

In autonomous driving, state-of-the-art deep neural networks are used for perception tasks such as semantic segmentation. While the environment in datasets is controlled, novel classes or unknown disturbances can occur in real-world applications. To provide safe autonomous driving, these cases must be identified.

The objective is to explore novel class segmentation and out-of-distribution approaches for semantic segmentation in the context of corner cases for autonomous driving.

Contact: Sebastian Schmidt

References: 

  • Segmenting Known Objects and Unseen Unknowns without Prior Knowledge 
  • Efficient Uncertainty Estimation for Semantic Segmentation in Videos  
  • Natural Posterior Network: Deep Bayesian Uncertainty for Exponential Family  
  • Description of Corner Cases in Automated Driving: Goals and Challenges 

Active Learning for Multi Agent 3D Object Detection 

Type: Master's Thesis

Industrial partner: BMW

  • Knowledge in Object Detection 
  • Excellent programming skills 

In autonomous driving, state-of-the-art deep neural networks are used for perception tasks such as 3D object detection. To deliver good results, these networks often require large amounts of complex annotated data for training. These annotations are often costly and redundant. Active learning selects the most informative samples for annotation, covering a dataset with as little annotated data as possible.

The objective is to explore active learning approaches for 3D object detection using combined uncertainty and diversity based methods.  
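One simple way to combine uncertainty and diversity can be sketched as follows: rank unlabeled samples by predictive entropy and skip candidates that lie too close to an already selected sample. This is a generic illustration in plain Python, not the approach of the cited papers; `select_batch`, the `min_dist` threshold and the toy feature vectors and class probabilities are assumptions made for this example.

```python
import math


def entropy(probs):
    """Predictive entropy of one class-probability vector (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_batch(features, probs, budget, min_dist):
    """Pick up to `budget` samples: most uncertain first, skipping near-duplicates."""
    order = sorted(range(len(probs)), key=lambda i: entropy(probs[i]), reverse=True)
    chosen = []
    for i in order:
        if len(chosen) >= budget:
            break
        # Diversity filter: keep a candidate only if it is far from all chosen ones.
        if all(distance(features[i], features[j]) >= min_dist for j in chosen):
            chosen.append(i)
    return chosen


# Toy pool of four unlabeled samples with 2D features and binary predictions.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (9.0, 1.0)]
probs = [(0.5, 0.5), (0.55, 0.45), (0.7, 0.3), (0.99, 0.01)]
picked = select_batch(features, probs, budget=2, min_dist=1.0)
print(picked)
```

Sample 1 is nearly as uncertain as sample 0 but almost identical to it in feature space, so the diversity filter passes over it in favour of sample 2; this is how redundant annotations are avoided.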

  • Exploring Diversity-based Active Learning for 3D Object Detection in Autonomous Driving   
  • Efficient Uncertainty Estimation for Semantic Segmentation in Videos   
  • KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection
  • Towards Open World Active Learning for 3D Object Detection   

Graph Neural Networks

Type:  Master's thesis / Bachelor's thesis / guided research

  • Knowledge of graph/network theory

Graph neural networks (GNNs) have recently achieved great successes in a wide variety of applications, such as chemistry, reinforcement learning, knowledge graphs, traffic networks, or computer vision. These models leverage graph data by updating node representations based on messages passed between nodes connected by edges, or by transforming node representations using spectral graph properties. These approaches are very effective, but many theoretical aspects of these models remain unclear and there are many possible extensions to improve GNNs and go beyond the nodes' direct neighbors and simple message aggregation.
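The message-passing idea described above can be sketched in a few lines of plain Python. The layer below averages the representations of a node and its direct neighbours and then applies a linear map and a ReLU. Mean aggregation with a self-loop is one common choice assumed here for simplicity (it is not the exact normalization of any specific published model), and the graph, features and weight matrix are made up.

```python
def message_passing_layer(adjacency, features, weight):
    """One round of mean-aggregation message passing, then a linear map and ReLU.

    adjacency: dict mapping node -> list of neighbour nodes
    features:  dict mapping node -> list of floats (current representation)
    weight:    matrix (list of rows) mapping input dim -> output dim
    """
    updated = {}
    out_dim = len(weight[0])
    for node, neighbours in adjacency.items():
        # Messages come from the direct neighbours plus the node itself.
        senders = neighbours + [node]
        dim = len(features[node])
        mean = [sum(features[s][d] for s in senders) / len(senders)
                for d in range(dim)]
        # out[k] = ReLU(sum_d mean[d] * weight[d][k])
        updated[node] = [
            max(0.0, sum(mean[d] * weight[d][k] for d in range(dim)))
            for k in range(out_dim)
        ]
    return updated


# A path graph a - b - c with 2-dimensional node features.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
weight = [[1.0, -1.0], [0.5, 1.0]]
hidden = message_passing_layer(adjacency, features, weight)
print(hidden)
```

After one layer each node only sees its direct neighbours; stacking k such layers lets information travel k hops, which is the limitation the description alludes to when it mentions going beyond direct neighbours.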

Contact: Simon Geisler

  • Semi-supervised classification with graph convolutional networks
  • Relational inductive biases, deep learning, and graph networks
  • Diffusion Improves Graph Learning
  • Weisfeiler and leman go neural: Higher-order graph neural networks
  • Reliable Graph Neural Networks via Robust Aggregation

Physics-aware Graph Neural Networks

Type:  Master's thesis / guided research

  • Proficiency with Python and deep learning frameworks (JAX or PyTorch)
  • Knowledge of graph neural networks (e.g. GCN, MPNN, SchNet)
  • Optional: Knowledge of machine learning on molecules and quantum chemistry

Deep learning models, especially graph neural networks (GNNs), have recently achieved great success in predicting quantum mechanical properties of molecules. These models have a vast number of applications, such as finding the best method of chemical synthesis or selecting candidates for drugs, construction materials, batteries, or solar cells. However, GNNs have only been proposed in recent years, and many open questions remain about how best to represent and leverage quantum mechanical properties and methods.

Contact: Nicholas Gao

  • Directional Message Passing for Molecular Graphs
  • Neural message passing for quantum chemistry
  • Learning to Simulate Complex Physics with Graph Networks
  • Ab initio solution of the many-electron Schrödinger equation with deep neural networks
  • Ab-Initio Potential Energy Surfaces by Pairing GNNs with Neural Wave Functions
  • Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds

Robustness Verification for Deep Classifiers

Type: Master's thesis / Guided research

  • Strong machine learning knowledge (at least equivalent to IN2064 plus an advanced course on deep learning)
  • Strong background in mathematical optimization (preferably in a machine learning setting)
  • Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
  • (Preferred) Knowledge of training techniques to obtain classifiers that are robust against small perturbations in data

Description: Recent work shows that deep classifiers are vulnerable to adversarial examples: misclassified points that are very close to the training samples or even visually indistinguishable from them. This undesired behaviour constrains the deployment of promising neural-network-based classification methods in safety-critical scenarios. New training methods are therefore needed that promote (or, preferably, guarantee) robust behaviour of the classifier around training samples.

Contact: Aleksei Kuvshinov

References (Background):

  • Intriguing properties of neural networks
  • Explaining and harnessing adversarial examples
  • SoK: Certified Robustness for Deep Neural Networks
  • Certified Adversarial Robustness via Randomized Smoothing
  • Formal guarantees on the robustness of a classifier against adversarial manipulation
  • Towards deep learning models resistant to adversarial attacks
  • Provable defenses against adversarial examples via the convex outer adversarial polytope
  • Certified defenses against adversarial examples
  • Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks
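
To make the notion of an adversarial example concrete, here is a small sketch of a one-step signed-gradient perturbation (in the spirit of the fast gradient sign method) against a logistic-regression classifier. The model, its weights, and the function names `predict_proba` and `fgsm` are hypothetical and chosen only so that the gradient can be written by hand; deep classifiers would need automatic differentiation instead.

```python
import math


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    """Return -1, 0, or 1 depending on the sign of `value`."""
    return (value > 0) - (value < 0)


def fgsm(w, b, x, y, eps):
    """Perturb x by eps per coordinate in the direction that increases the loss.

    For binary cross-entropy with a logistic model, the gradient of the loss
    with respect to the input is (p - y) * w.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]
```

For example, with `w = [2.0, -1.0]`, `b = 0.0`, and the input `x = [0.2, 0.1]` labelled 1, the model predicts class 1, but `fgsm(w, b, x, 1, 0.2)` moves each coordinate by at most 0.2 and pushes the predicted probability below 0.5. Robust training and verification aim to rule out such flips within a given perturbation radius.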

Uncertainty Estimation in Deep Learning

Type: Master's Thesis / Guided Research

  • Strong knowledge in probability theory

Safe prediction is a key feature in many intelligent systems. Classically, machine learning models compute output predictions regardless of the underlying uncertainty of the situations they encounter. In contrast, aleatoric and epistemic uncertainty provide knowledge about undecidable and uncommon situations. Taking uncertainty into account can substantially help to detect and explain unsafe predictions, and therefore make ML systems more robust. The goal of this project is to improve uncertainty estimation in ML models for various types of tasks.

Contact: Tom Wollschläger, Dominik Fuchsgruber, Bertrand Charpentier

  • Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
  • Predictive Uncertainty Estimation via Prior Networks
  • Posterior Network: Uncertainty Estimation without OOD samples via Density-based Pseudo-Counts
  • Evidential Deep Learning to Quantify Classification Uncertainty
  • Weight Uncertainty in Neural Networks
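
One common way to separate the two kinds of uncertainty mentioned above is an entropy decomposition over several predictions for the same input, for instance from an ensemble. The sketch below assumes such a set of class-probability vectors is available; the function name `decompose_uncertainty` is hypothetical, and this is only one of several estimators, not the approach of any specific paper in the list.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(member_probs):
    """Split predictive uncertainty into aleatoric and epistemic parts.

    member_probs -- one class-probability vector per ensemble member,
                    all for the same input.
    Returns (total, aleatoric, epistemic).
    """
    n = len(member_probs)
    k = len(member_probs[0])
    mean = [sum(m[c] for m in member_probs) / n for c in range(k)]

    total = entropy(mean)                                 # entropy of the averaged prediction
    aleatoric = sum(entropy(m) for m in member_probs) / n  # average per-member entropy
    epistemic = total - aleatoric                         # disagreement between members
    return total, aleatoric, epistemic
```

If every member predicts `[0.5, 0.5]`, all of the uncertainty is aleatoric (the input itself is ambiguous). If one member predicts `[1.0, 0.0]` and another `[0.0, 1.0]`, the total is the same but it is entirely epistemic (the models disagree), which is typical for uncommon inputs.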

Hierarchies in Deep Learning

Type:  Master's Thesis / Guided Research

Multi-scale structures are ubiquitous in real-life datasets. For example, phylogenetic nomenclature naturally yields a hierarchical classification of species based on their evolutionary history. Learning multi-scale structures can help to reveal natural and meaningful organization in the data and to obtain compact data representations. The goal of this project is to leverage multi-scale structures to improve the speed, performance, and understanding of deep learning models.

Contact: Marcel Kollovieh, Bertrand Charpentier

  • Tree Sampling Divergence: An Information-Theoretic Metric for Hierarchical Graph Clustering
  • Hierarchical Graph Representation Learning with Differentiable Pooling
  • Gradient-based Hierarchical Clustering
  • Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space
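
For readers new to hierarchical structure in data, the following sketch builds a hierarchy with classic single-linkage agglomerative clustering on one-dimensional points. Note that this is the textbook bottom-up procedure, not the gradient-based or hyperbolic methods in the reading list; the function name `single_linkage` and the merge-history output format are assumptions made for illustration.

```python
def single_linkage(points):
    """Agglomerative clustering of 1-D points with single linkage.

    Returns the merge history as a list of
    (cluster_a, cluster_b, distance) tuples, where clusters are
    tuples of point indices. Read bottom-up, this is the hierarchy.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(abs(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges
```

For the points `[0.0, 1.0, 10.0, 12.0]`, the two tight pairs are merged first and the two resulting groups are joined last at a much larger distance, so the merge distances expose the multi-scale structure of the data.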

Best Hadoop Projects


Big Data Thesis Topics

Big Data Thesis Topics is the starting point for your research goals. We design our Big Data thesis topics for students and academic researchers who want streamlined, comprehensive knowledge of the field. We work solely with students and the research community, aiming to meet their needs from the first stage of topic selection to the final viva voce. We deliver the service in an interactive, well-coordinated manner. Each student or researcher project is assigned experts with full domain knowledge and up-to-date research knowledge, so that every scholar receives individual attention. Do you need support or guidance in selecting a Big Data thesis topic? Contact us without delay.

The Big Data Thesis Topics service was introduced to support students and research colleagues working in the Big Data field. Today, Google Cloud Dataproc provides a managed Hadoop and Spark service for processing large datasets easily with the powerful open tools of the Apache Big Data ecosystem. We offer training in Cloud Dataproc and its integration with compute, storage, and monitoring services on the cloud platform.
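
The processing model behind Hadoop can be illustrated without a cluster. The sketch below imitates the map, shuffle, and reduce phases of a word count in plain Python. It does not use the Hadoop, Spark, or Dataproc APIs; all function names here are hypothetical and serve only to show the flow of key-value pairs.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all counts emitted for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real deployment the map and reduce calls would run in parallel on different machines, and the shuffle would move data across the network; the logic per key stays the same.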

Why Choose Big Data as a Thesis Topic?

  • To reduce the computation cost
  • Faster and better Decision Making
  • Perform Risk Analysis
  • New Product and Services

Major Applications of Big Data as a Thesis Topic:

  • Data Virtualization (Data abstraction and DF component)
  • IoT Analytics (Access Data from anywhere)
  • Data Federation (Data integrate from anywhere)
  • Point in Time Analysis (Gather Big Data over a Small Duration)
  • Multi-Voxel Pattern Analysis (Human Brain Decoding and Deep Learning)

One of Our Best Thesis Structures in Big Data:

  Table of Contents

-Introduction to the Study

  • Research Questions
  • Empirical Setting
  • Limitations
  • Disposition of the research

-Theoretical Framework

  • Innovation Management
  • Your focus area
  • Implementation of your focus area

-Methodology

  • Research Strategy
  • Research Design
  • Research Method
  • Primary Data Collection
  • Secondary Data Collection
  • Data Analysis
  • Research Quality

-Empirical Findings

  • Key success factors
  • Performance analysis with existing solutions

-Conclusion

  • Recommendations
  • Future Research

-References

-Appendixes

Latest Big Data Thesis Topics:

  • Machine Learning Algorithms and Wearable Technologies for Fall Recognition
  • Korean Morphological Analyzer Construction Using a Grapheme Level Strategy without Linguistic Knowledge
  • Divergence and Convergence on Internet of Things (IoT) Based Manufacturing in Industrial and Academics Interests
  • Symmetric Bisecting K-Means Centers Repositioning for Big Data Clustering to Enhanced Distance Calculation Reduction
  • Reliable Data Movements Using Bandwidth Provision Strategies in Dedicated Networks
  • Hierarchical Change Detection System Based on Scalable Nearest Neighbor for Monitoring Crop
  • Big Data Analytics Using Artificial Neural Networks for Player’s Patterns Recognition in Cloud Gaming
  • Online Anomaly Detection in Cloud Collaborative Environment for Data Streams Using Non-Parametric Technique
  • Shape Matching for Automated Bow Echo Detection Using Skeleton Context
  • Cloud Computing Leveraging for Grid Responsive Buildings to Non-Intrusive Monitor and Powerful Framework conversion
  • Enhance Maximizing Spread Efficiency for Large Sparse Networks in the Flow Authority Model
  • Hash Neighborhood Candidate Generation and Probabilistic Signature Hash Method on Big Data
  • Automated Extremist Twitter Accounts Classification Using Network Based and Content Based Features
  • Linked Data Paradigm for Connecting API Access and Building Cloud Based Smart Applications with Data Discovery Approaches
  • Adapting for Decomposition of Efficient Parallel PARAFAC Tensor to Data Sparsity in Hadoop


© 2015 HADOOP SOLUTIONS|Theme Developed By Hadoop Solutions | Dissertation project topics
