
Big Data and Data Science
Data Governance

Objectives:

Since Big Data was incorporated into data warehousing and business intelligence environments, Data Science techniques have been used to provide a forward-looking (‘windshield’) view of the organisation. Predictive, real-time, and model-based capabilities, drawing on different types of data sources, give organisations better insight into where they are heading.

Exploiting Big Data, however, requires a change in the way data is managed. Most data warehouses are based on relational models, whereas Big Data is not normally organised in a relational model. Most data warehouses depend on the concept of ETL (Extract, Transform and Load). Big Data solutions, such as data lakes, depend on the concept of ELT (Extract, Load and Transform): the data is loaded first and transformed afterwards. Equally important, the speed and volume of the data present challenges that require different approaches to critical aspects of data management, such as integration, Metadata Management, and Data Quality assessment.
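The difference between ETL and ELT can be illustrated with a minimal sketch. It assumes an in-memory SQLite database standing in for both the warehouse and the data lake; the table names, the sample rows, and the `parse_amount` helper are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical raw records: amounts arrive as text, as they might from a source feed.
RAW_ROWS = [("a1", " 10.5 "), ("a2", "7"), ("a3", "n/a")]


def parse_amount(text):
    """Return the amount as a float, or None when the text is not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def etl(conn):
    """ETL: transform in application code first, then load only the clean rows."""
    conn.execute("CREATE TABLE sales_etl (id TEXT, amount REAL)")
    parsed = [(row_id, parse_amount(amount)) for row_id, amount in RAW_ROWS]
    clean = [row for row in parsed if row[1] is not None]
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", clean)


def elt(conn):
    """ELT: load the raw data untouched, then transform inside the store."""
    conn.execute("CREATE TABLE sales_raw (id TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", RAW_ROWS)
    # The transformation is a query over the raw table, so it can be changed
    # or rerun later without reloading the source.
    conn.execute(
        """
        CREATE VIEW sales_elt AS
        SELECT id, CAST(TRIM(amount) AS REAL) AS amount
        FROM sales_raw
        WHERE TRIM(amount) GLOB '[0-9]*'
        """
    )


conn = sqlite3.connect(":memory:")
etl(conn)
elt(conn)
etl_rows = conn.execute("SELECT id, amount FROM sales_etl ORDER BY id").fetchall()
elt_rows = conn.execute("SELECT id, amount FROM sales_elt ORDER BY id").fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM sales_raw").fetchone()[0]
```

Both paths yield the same clean rows here, but only the ELT path keeps the unparseable record in the store, where a later analysis can still inspect it.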


Activities carried out by our Team:

Defining the strategy and business requirements for Big Data

An organisation’s Big Data strategy must be aligned with and support the overall corporate strategy and business requirements and be part of the data strategy. A Big Data strategy must include criteria to assess:

  • What problems the organisation is trying to solve and what it needs analytics for.
  • Which data sources to use or acquire.
  • The timeliness and scope of the data to be provided.
  • The impact on, and relationship to, other data structures.
  • Influences on existing modelled data.

Choosing Data Sources

As with any development project, the choice of data sources for Data Science work must be guided by the problems the organisation is trying to solve. The difference with Big Data / Data Science is that the range of data sources is wider: it is not limited by format and can include data both external and internal to the organisation. The ability to incorporate this data into a solution also carries risks. The quality and reliability of the data must be assessed, and a plan put in place for its use over time.

Acquire and Ingest Data Sources

Once identified, sources must be located, sometimes purchased, and ingested into the Big Data environment. During this process, critical metadata about the source (e.g. origin, size, status and additional information about its content) must be captured. Many ingestion engines profile the data during ingestion, providing analysts with at least partial Metadata. Once the data is in a data lake, it can be evaluated for its suitability for multiple analysis tasks. Since the creation of data science models is an iterative process, so is data ingestion: gaps in the current data asset base must be identified iteratively, and the sources that fill them onboarded.
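As an illustration of capturing Metadata during ingestion, the following sketch records origin, size, and a simple column profile while parsing a source. It assumes the source is CSV text; the function name, the metadata fields, and the sample file name `crm_export.csv` are hypothetical, and real ingestion engines typically profile far more.

```python
import csv
import io
from datetime import datetime, timezone


def ingest_with_metadata(origin, csv_text):
    """Parse CSV text and return (rows, metadata) captured at ingestion time."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []

    # Minimal profiling: how many values are filled in and how many are distinct.
    profile = {}
    for col in columns:
        non_empty = [r[col] for r in rows if r[col] not in ("", None)]
        profile[col] = {
            "non_empty": len(non_empty),
            "distinct": len(set(non_empty)),
        }

    metadata = {
        "origin": origin,
        "size_bytes": len(csv_text.encode("utf-8")),
        "row_count": len(rows),
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "profile": profile,
    }
    return rows, metadata


# Hypothetical sample source with one missing country value.
SAMPLE = "customer_id,country\n1,IT\n2,\n3,IT\n"
rows, metadata = ingest_with_metadata("crm_export.csv", SAMPLE)
```

The profile makes the gap in `country` visible as soon as the data lands, which is the kind of partial Metadata an analyst can use to judge suitability.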

Developing hypotheses and methods for data

Data Science is about building answer sets that can find meaning or insight within data. Developing Data Science solutions involves building statistical models that identify correlations and trends within and between data elements and data sets. A question can have multiple answers, depending on the inputs to the model.
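A minimal sketch of identifying correlations between data elements follows. It uses the Pearson correlation coefficient as one possible statistical measure (an assumption; the text does not prescribe a method), and the sample series are invented for illustration. It also shows how the answer changes with the model's inputs.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly figures.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 6, 8, 10]
returns = [5, 3, 4, 1, 2]

# The same question ("what moves with ad spend?") gives different answers
# depending on which data set is fed to the model.
r_sales = pearson(ad_spend, sales)
r_returns = pearson(ad_spend, returns)
```

Here `r_sales` is strongly positive and `r_returns` is negative, so the hypothesis being tested determines which input, and therefore which answer, is relevant.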

Integration/alignment of data for analysis

Preparing data for analysis involves understanding the data, finding links between data from the various sources, and aligning the data for use. In many cases, joining data sources is more of an art than a science. One method is to use a model that integrates the data using a common key. Another is to scan and join data using indexes within the database engines, applying similarity and record-linkage algorithms. Other methods find correlations that are then used to build the model that displays the results. It is worth using techniques during the initial stages that help to understand how the model will display the results once published.
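The two joining approaches described above can be sketched as follows: an exact join on a common key, and a similarity-based linkage for sources that share no key. This is a simplified illustration; the sample records, the `normalise` helper, and the 0.85 threshold are assumptions, and the similarity measure is Python's standard `difflib.SequenceMatcher` rather than a dedicated record-linkage library.

```python
from difflib import SequenceMatcher

# Hypothetical sources: two internal tables sharing a key, one external list without it.
crm = [
    {"customer_id": 1, "name": "Acme S.p.A."},
    {"customer_id": 2, "name": "Globex Ltd"},
]
orders = [
    {"customer_id": 1, "total": 120.0},
    {"customer_id": 2, "total": 80.0},
    {"customer_id": 1, "total": 40.0},
]
external = [
    {"company": "ACME SPA", "sector": "manufacturing"},
    {"company": "Initech", "sector": "software"},
]


def join_on_key(left, right, key):
    """Inner join of two lists of dicts on a shared key."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def normalise(name):
    """Lower-case the name and drop punctuation and spaces before comparing."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def link_by_similarity(left, right, left_field, right_field, threshold=0.85):
    """Link each left record to its most similar right record above a threshold."""
    links = []
    for l in left:
        best, best_score = None, 0.0
        for r in right:
            score = SequenceMatcher(
                None, normalise(l[left_field]), normalise(r[right_field])
            ).ratio()
            if score > best_score:
                best, best_score = r, score
        if best is not None and best_score >= threshold:
            links.append((l, best, best_score))
    return links


joined = join_on_key(crm, orders, "customer_id")
links = link_by_similarity(crm, external, "name", "company")
```

The threshold is a judgement call: set too low it links unrelated records, set too high it misses genuine matches, which is part of why joining is described as more art than science.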

Implementation and monitoring

A model that meets the business needs in a feasible manner can be deployed to the production environment, where it is monitored continuously. Such models require refinement and maintenance. Various modelling techniques are available for implementation. Models can serve batch processes as well as real-time integration messages.
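One simple way to monitor a deployed model is to track its recent prediction error and flag it for refinement when the error drifts above a tolerance. The sketch below assumes a numeric prediction whose actual value becomes known later; the `ModelMonitor` class, the window size, and the threshold are hypothetical choices, not a prescribed method.

```python
from collections import deque


class ModelMonitor:
    """Track the mean absolute error over the most recent predictions."""

    def __init__(self, window=5, max_mean_abs_error=2.0):
        self.errors = deque(maxlen=window)  # oldest errors drop out automatically
        self.max_mean_abs_error = max_mean_abs_error

    def record(self, predicted, actual):
        """Store the absolute error of one prediction once the actual value is known."""
        self.errors.append(abs(predicted - actual))

    def mean_abs_error(self):
        return sum(self.errors) / len(self.errors) if self.errors else 0.0

    def needs_refinement(self):
        """True only when the window is full and the error exceeds the tolerance."""
        window_full = len(self.errors) == self.errors.maxlen
        return window_full and self.mean_abs_error() > self.max_mean_abs_error


monitor = ModelMonitor(window=3, max_mean_abs_error=1.0)

# The model starts out accurate.
for predicted, actual in [(10, 10.5), (12, 11.5), (9, 9.0)]:
    monitor.record(predicted, actual)
healthy = not monitor.needs_refinement()

# Later predictions drift away from the actual values.
for predicted, actual in [(10, 14), (11, 15)]:
    monitor.record(predicted, actual)
drifted = monitor.needs_refinement()
```

The same check works whether observations arrive in batches or one message at a time, since each call to `record` updates the rolling window.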
