Pentaho Data Integration Community

While the Enterprise Edition has native Hadoop integration, the community has built extensive workarounds. By using a Modified Java Script Value step to call the Hadoop API, or a Shell job entry to run sqoop commands, you can integrate PDI CE with HDFS, Hive, and Spark. There is even a community-maintained "PDI for Big Data" plugin pack.
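As a rough illustration of the sqoop route, the sketch below assembles a `sqoop import` command line of the kind a shell script launched from PDI might run. The function name, JDBC URL, table, and paths are hypothetical placeholders, and the command is only printed, not executed.

```python
import shlex


def build_sqoop_import(jdbc_url, table, target_dir, username, password_file, mappers=1):
    """Return the argument list for a `sqoop import` run (nothing is executed here)."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ]


# All values below are made-up examples, not taken from the article.
cmd = build_sqoop_import(
    jdbc_url="jdbc:mysql://db.example.com/sales",
    table="orders",
    target_dir="/data/raw/orders",
    username="etl_user",
    password_file="/user/etl/.sqoop_pw",
    mappers=4,
)

# Print a shell-safe version that could be pasted into a script.
print(shlex.join(cmd))
```

Keeping the command as a list until the last moment avoids quoting mistakes when a value contains spaces.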

Read data from CSVs, Excel files, relational databases (MySQL, PostgreSQL), NoSQL (MongoDB), or APIs.
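To make the idea of reading from mixed sources concrete, here is a minimal sketch in plain Python (not PDI itself) that pulls rows from a CSV and from a relational table, then normalizes them into one row format, which is roughly what an input step followed by a type-conversion step does in a transformation. The sample data, table name, and helper names are invented for illustration, and an in-memory SQLite database stands in for MySQL or PostgreSQL.

```python
import csv
import io
import sqlite3

# Invented sample data standing in for a CSV file on disk.
CSV_TEXT = "id,name,amount\n1,alice,10.5\n2,bob,3\n"


def rows_from_csv(text):
    """Read CSV text into a list of dicts (all values are strings)."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def rows_from_db(conn, query):
    """Run a query and return each result row as a dict keyed by column name."""
    conn.row_factory = sqlite3.Row
    return [dict(row) for row in conn.execute(query)]


def normalize(row):
    """Coerce both sources to the same field types."""
    return {"id": int(row["id"]), "name": str(row["name"]), "amount": float(row["amount"])}


# In-memory SQLite database used as a stand-in relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, amount REAL)")
conn.execute("INSERT INTO customers VALUES (3, 'carol', 7.25)")

csv_rows = rows_from_csv(CSV_TEXT)
db_rows = rows_from_db(conn, "SELECT id, name, amount FROM customers")
combined = [normalize(r) for r in csv_rows + db_rows]

for row in combined:
    print(row)
```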

The Pentaho Data Integration Community has been used in a variety of real-world use cases, including:

The community is not just a support forum; it is the R&D department of the open-source ETL world. Here is why it is invaluable:

Pentaho Data Integration (PDI), affectionately known as Kettle, remains one of the world's most widely deployed open-source ETL (Extract, Transform, Load) tools. For nearly two decades, the PDI community has built a robust ecosystem around visual data orchestration, enabling developers to bypass complex coding in favor of a powerful "drag-and-drop" design environment.

Kitchen: the command-line tool used to execute jobs (.kjb files).
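A minimal sketch of how such a job might be launched from a script, assuming the Linux-style `-file=`, `-level=`, and `-param:NAME=value` options of the Kitchen launcher. The install path, job file, and parameter name are hypothetical, and `run_job` is defined but not called here.

```python
import subprocess


def build_kitchen_command(kitchen_path, job_file, params=None, log_level="Basic"):
    """Build the argument list for running a .kjb job with the Kitchen launcher."""
    cmd = [kitchen_path, f"-file={job_file}", f"-level={log_level}"]
    # Named parameters are passed as -param:NAME=value; sorted for a stable order.
    for name, value in sorted((params or {}).items()):
        cmd.append(f"-param:{name}={value}")
    return cmd


def run_job(cmd):
    """Run the command and report success; exit code 0 is treated as success."""
    result = subprocess.run(cmd)
    return result.returncode == 0


# Hypothetical paths and parameter, for illustration only.
cmd = build_kitchen_command(
    "/opt/pdi/kitchen.sh",
    "/etl/jobs/load_sales.kjb",
    params={"RUN_DATE": "2024-01-01"},
)
print(cmd)
```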


"Never lose a Kettle transformation again: Version control for the Community Edition."

4. Advanced Data Orchestration

Go beyond simple transformations to complex logic.

At its heart, Pentaho Data Integration (PDI) is a comprehensive, open-source ETL platform. Its primary function is to enable users to visually design, automate, and manage the flow of data from source to target. This involves:

If you're interested in joining the Pentaho Data Integration community, here are some ways to get involved:

Pentaho offers a tiered licensing model to cater to different user needs. Community Edition (CE) and Enterprise Edition (EE) compare as follows:

  • Licensing: CE is free (LGPL/GPL licenses); EE is an annual subscription.
  • Support: CE is community-driven (forums/Wiki); EE includes professional support with SLAs.
  • Processing: CE offers basic parallel processing; EE adds load balancing, clustering, and data federation.
  • Scheduling: CE requires external tools or scripts; EE has a built-in automated scheduler.
  • Connectivity and security: CE covers basic relational/NoSQL sources; EE adds advanced LDAP/Active Directory integration.

Source: Pentaho Data Integration Community Edition, Apix-Drive, 1 Aug 2024.
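Since scheduling in CE depends on external tools or scripts, one common choice is cron. The sketch below formats a daily crontab line for a Kitchen run; the helper name, paths, and schedule are assumptions made for illustration, and the line is only printed, not installed.

```python
import shlex


def cron_entry(minute, hour, command_args, log_path):
    """Format a crontab line that runs the command once a day at hour:minute."""
    schedule = f"{minute} {hour} * * *"
    command = shlex.join(command_args)
    # Append stdout and stderr to a log file so failures can be reviewed later.
    return f"{schedule} {command} >> {shlex.quote(log_path)} 2>&1"


# Hypothetical job and log locations.
entry = cron_entry(
    30, 2,
    ["/opt/pdi/kitchen.sh", "-file=/etl/nightly.kjb"],
    "/var/log/pdi/nightly.log",
)
print(entry)
```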

You don't have to write Java to participate. The community thrives on:

CONTACT US

BJGP Journal Office
RCGP
30 Euston Square
London NW1 2FB
Tel: +44 (0)20 3188 7400

British Journal of General Practice is an editorially-independent publication of the Royal College of General Practitioners
All Rights Reserved © 2026 Mirror Weekly

Print ISSN: 0960-1643
Online ISSN: 1478-5242