Modern data repositories

Semester
4.
ECTS credits
3 ECTS

Goal

To introduce students to contemporary concepts used in data storage and analysis in business and science, and to train students to use these concepts in practice, especially data warehouses.

Additional info

Modern concepts of digital data management
Data massaging. Unstructured, semi-structured and structured data repositories. Data quality management. Organizational problems related to data storage and retrieval in data repositories. Strengths and weaknesses of relational databases. Modern solutions for the storage, retrieval and distribution of large amounts of data.
Production and analytical data repositories in business and science
Organization of digitized data throughout history. Data and metadata. File systems for data storage (text and binary files). Organization of data in files (free form, formatted records - HTML, XML, etc.). Database management systems. Data organization models in databases (Flat (tabular), Entity-relationship, Hierarchical, Network, Relational, Object, Dimensional, Attribute-relation-value, Non-relational (NoSQL), Graph, Document-oriented, Autonomous, Semantic, Hadoop, Combined, etc.). Database modeling. Data organization in databases with respect to the storage system (centralized system, distributed system, client-server system, parallel system, cloud storage system, mobile databases, etc.). Database life cycle. Databases in business practice. Databases in science (full-text databases, citation databases, bibliographic databases). Use of scientific databases (WoS, Scopus, Croatian Scientific Bibliography, Hrčak, etc.). Knowledge bases. Organization of data and knowledge in the Web environment. Interaction with the database (interface, languages, storage, replication, security and confidentiality, transaction processing, migrations, installation, administration, maintenance, tuning of the management system, backups and recovery of lost data, use, optimization of use, etc.). Interests related to data repositories and data analysis. Planning and designing data repositories in accordance with the information needs of the business system. Analytical potential of data repositories.
Data management in the future (the concept of large amounts of data - Big Data)
Big Data concept (directives and parameters that define the concept). Sources of large amounts of data in business. Types of data. Data characteristics (volume, velocity (fast access to data), variety (diversity of data sources), variability (changes in data over time), veracity (data quality)). High-performance computing. Repositories of voluminous data (data warehouses, cloud storage). Technologies related to the concept of managing large amounts of data (Databases, Cloud Computing, Business Intelligence, Data Analysis and Visualization, etc.). Algorithms for processing, analyzing and displaying large amounts of data in real time or near real time with minimal load on the information and communication infrastructure. Productive and learning potential of large amounts of data. Statistics as a tool for processing large amounts of data. Analytical dimensions of the concept (Descriptive - what happened, Diagnostic - why it happened, Predictive - what will happen, Prescriptive - what needs to be done). Differences between Big Data analysis and classic data analysis. Planning and implementation of the Big Data analytical process (research objectives, defining questions, defining a strategy for answering questions, collecting data, conducting analysis, repeating the research). Legal aspects of analyzing large amounts of data. Personal data and personal data protection.
Data warehouses as repositories supporting business intelligence and Big Data concepts
Production and analytical databases. Structured and semi-structured data sources for business analytics. Dimensional data modeling. ETL process. Operational data repository. Business data warehouse. Fact tables. Dimension tables. Star schema. Snowflake schema. Multidimensional data structures (cubes). Design approaches: top-down and bottom-up. Reporting from the data warehouse. (Just-in-time) business intelligence tools. Analytical reporting services. MDX queries. Data warehouse data visualization.
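
As an illustration of the fact table, dimension table and star structure topics listed above, the following is a minimal sketch and not part of the official course material. It assumes an in-memory SQLite database; the table names (dim_product, dim_date, fact_sales), the columns and the sample rows are hypothetical and chosen only for this example. One central fact table references two dimension tables, and an analytical query aggregates a measure along one dimension.

```python
import sqlite3


def build_star_schema(conn: sqlite3.Connection) -> None:
    """Create a tiny star schema (hypothetical tables) and load sample rows."""
    conn.executescript(
        """
        -- Dimension tables hold descriptive attributes.
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT,
            category   TEXT
        );
        CREATE TABLE dim_date (
            date_id INTEGER PRIMARY KEY,
            year    INTEGER,
            month   INTEGER
        );
        -- The fact table holds measures plus a key into each dimension.
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id    INTEGER REFERENCES dim_date(date_id),
            quantity   INTEGER,
            revenue    REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Laptop", "Hardware"), (2, "Mouse", "Hardware"), (3, "Editor", "Software")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(1, 2024, 1), (2, 2024, 2)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 1, 2, 2000.0), (2, 1, 5, 100.0), (3, 2, 1, 50.0), (1, 2, 1, 1000.0)],
    )


def revenue_by_category(conn: sqlite3.Connection) -> list:
    """Aggregate the revenue measure along the product dimension."""
    return conn.execute(
        """
        SELECT p.category, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    for category, revenue in revenue_by_category(connection):
        print(category, revenue)
```

A snowflake structure would differ only in that a dimension is normalized further, for example by moving category into its own table referenced from dim_product.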
Types of teaching

Lectures: 15
Seminars: 0
Exercises: 15

Learning outcomes

1. Analyze the information needs of a specific business entity in order to size its production and analytical data repositories and plan the implementation of data analysis.
2. Model a dimensional data repository for business data analysis purposes.
3. Evaluate the storage and analytical usability of available data repositories.
