Java Data Profiling Tools

View 4222 business solutions

Browse free open source Java Data Profiling Tools and projects below. Use the toggles on the left to filter open source Java Data Profiling Tools by OS, license, language, programming language, and project status.

  • Enterprise Job Scheduling Software Icon
    Enterprise Job Scheduling Software

    Unify Enterprise Job Scheduling for Scale, Visibility, and Control

    Managing your sprawling data center and cloud with disparate native schedulers creates chaos. Achieve unparalleled control and efficiency over your entire IT environment with JAMS job orchestration tools. JAMS provides the singular, centralized platform required to overcome the complexities of disparate native schedulers. Automate, secure, and govern all your workloads, eliminating fragmented control, compliance risks, and operational bottlenecks. JAMS streamlines operations and ensures audit-ready history, transforming your enterprise automation with confidence and precision.
    Learn More
  • Peer to Peer Recognition Brings Teams Together Icon
    Peer to Peer Recognition Brings Teams Together

    The modern employee engagement platform for the modern workforce

    Create a positive and energetic workplace environment with Motivosity, an innovative employee recognition and engagement platform. With Motivosity, employees can give each other small monetary bonuses for doing great things, promoting trust, collaboration, and appreciation in the workplace. The software solution comes with features such as an open-currency open-reward system, insights and analytics, dynamic organization chart, award programs, milestones, and more.
    Learn More
  • 1
    DQO Data Quality Operations Center

    DQO Data Quality Operations Center

    Data Quality Operations Center

    DQO is an DataOps friendly data quality monitoring tool with customizable data quality checks and data quality dashboards. DQO comes with around 100 predefined data quality checks which helps you monitor the quality of your data. Table and column-level checks which allows writing your own SQL queries. Daily and monthly date partition testing. Data segmentation by up to 9 different data streams. Build-in scheduling. Calculation of data quality KPIs which can be displayed on multiple built-in data quality dashboards.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 2
    DataCleaner

    DataCleaner

    Data quality analysis, profiling, cleansing, duplicate detection +more

    DataCleaner is a data quality analysis application and a solution platform for DQ solutions. It's core is a strong data profiling engine, which is extensible and thereby adds data cleansing, transformations, enrichment, deduplication, matching and merging. Website: http://datacleaner.github.io
    Downloads: 6 This Week
    Last Update:
    See Project
  • 3
    Open Source Data Quality and Profiling

    Open Source Data Quality and Profiling

    World's first open source data quality & data preparation project

    This project is dedicated to open source data quality and data preparation solutions. Data Quality includes profiling, filtering, governance, similarity check, data enrichment alteration, real time alerting, basket analysis, bubble chart Warehouse validation, single customer view etc. defined by Strategy. This tool is developing high performance integrated data management platform which will seamlessly do Data Integration, Data Profiling, Data Quality, Data Preparation, Dummy Data Creation, Meta Data Discovery, Anomaly Discovery, Data Cleansing, Reporting and Analytic. It also had Hadoop ( Big data ) support to move files to/from Hadoop Grid, Create, Load and Profile Hive Tables. This project is also known as "Aggregate Profiler" Resful API for this project is getting built as (Beta Version) https://sourceforge.net/projects/restful-api-for-osdq/ apache spark based data quality is getting built at https://sourceforge.net/projects/apache-spark-osdq/
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    AMB Data Profiling Data Quality
    AMB New Generation Data Empowerment - offers a comprehensive approach to data governance needs with ground breaking features to locate, identify, discover, manage and protect your overall data infrastructure. Repeatable Process/Exposed Repository.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Effortlessly manage macOS, iOS, iPadOS and tvOS devices across multiple clients and locations. Icon
    Effortlessly manage macOS, iOS, iPadOS and tvOS devices across multiple clients and locations.

    The Most Powerful Apple Device Management Tool for MSPs and IT Teams

    Addigy solutions accelerate Apple adoption in any environment.
    Learn More
  • 5
    Semantic Type Detection

    Semantic Type Detection

    Metadata/data identification Java library

    Metadata/data identification Java library. Identifies Base Type (e.g. Boolean, Double, Long, String, LocalDate, LocalTime, ...) and Semantic Type information (e.g. Gender, Age, Color, Country, ...). Extensive country/language support. Extensible via user-defined plugins. Comprehensive Profiling support. Large set of built-in Semantic Types (extensible via JSON defined plugins). Extensive Profiling metrics (e.g. Min, Max, Distinct, signatures, …) Sufficiently fast to be used inline. See Speed notes below. Minimal false positives for Semantic type detection. See Performance notes below. Usable in either Streaming, Bulk or Record mode. Broad country/language support - including US, Canada, Mexico, Brazil, UK, Australia, much of Europe, Japan and China. Support for sharded analysis (i.e. Analysis results can be merged) Once stream is profiled then subsequent samples can be validated and/or new samples can be generated.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    apache spark data pipeline osDQ

    apache spark data pipeline osDQ

    osDQ dedicated to create apache spark based data pipeline using JSON

    This is an offshoot project of open source data quality (osDQ) project https://sourceforge.net/projects/dataquality/ This sub project will create apache spark based data pipeline where JSON based metadata (file) will be used to run data processing , data pipeline , data quality and data preparation and data modeling features for big data. This uses java API of apache spark. It can run in local mode also. Get json example at https://github.com/arrahtech/osdq-spark How to run Unzip the zip file Windows : java -cp .\lib\*;osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c .\example\samplerun.json Mac UNIX java -cp ./lib/*:./osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c ./example/samplerun.json For those on windows, you need to have hadoop distribtion unzipped on local drive and HADOOP_HOME set. Also copy winutils.exe from here into HADOOP_HOME\bin
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    A Data profiling project
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB