Research

Data Governance Innovation Lab adheres to the win-win development concept and welcomes cooperation with experts in academia and industry in the following research areas. For any queries, contact us at longjiang4@huawei.com.

  • Intelligent Data Value Exploration Platform

    Traditional data analysis is driven by specific service requirements and covers data integration, governance, development, and analysis. The future is an era of data-driven innovation, in which mining data value and new service scenarios from massive data through uncertain, random exploration will become the norm. We are therefore exploring an intelligent data exploration platform that supports random, information-rich exploration to help customers discover value.
  • Next-Generation Intelligent Data Lake Computing Mode Powered by Vector Calculation

    AI factors such as feature vectorization, confidence, and probability pose new requirements for data computing and storage. Combining vector calculation with statistical analysis can guide the exploration of next-generation big data computing.
  • Intelligent Data Detection, Repair, Association, and Sampling

    Intelligent data quality detection and repair, association, entity merging, sampling, and comprehensive profiling
  • Intelligent Data Asset Management Engine

    Federated metadata management of data assets across public cloud, private cloud, and local data sources; tens of millions of metadata entries and their relationships, with millisecond-level query performance; unstructured metadata governance, with fuzzy retrieval and recommendation of images, video, and text; a real-time data lake metadata system for unified metadata management of a big data cluster with more than 20,000 nodes
  • Intelligent Data Security Management Engine

    Full-link security governance: algorithms for various GDPR-compliant data classification and masking scenarios, including data labeling and watermarking
  • Intelligent Data Quality Engine

    Intelligent data quality algorithms: abnormal data detection and repair, entity merging, and data column association algorithms, with accuracy and recall above 90% on all datasets; high-performance data quality engine: TB-level data quality processing in seconds, with distributed memory caching and automatic scaling
  • Intelligent Model-Driven Engine

    Model-driven intelligent data pipeline construction and data asset generation
  • High-Performance Cross-Source Query Optimizer

    Cross-region and cross-engine scheduling and optimization across multiple computing engines, such as Hive, Spark, HBase, and MySQL, improving performance by over 10 times compared with open-source Rheem and Calcite
  • Intelligent Hybrid Data Lake Scheduling Engine

    Cross-region data resource scheduling, data resource scheduling across public cloud and HCS hybrid cloud, and AI operator scheduling; concurrent scheduling of millions of nodes during peak hours
  • Visualized Development Recommendation Engine Based on Machine Learning

    Intelligent industry module recommendation on visualized screens: intelligent template recommendation based on users' industry background; smart assistance optimization on visualized screens: intelligent one-click optimization (intelligent color matching and layout) through machine learning; scenario-based visualized modeling and development platforms, such as 3D city and 3D campus, as well as device-edge-cloud big data input and visualized interaction and presentation