A high-severity vulnerability (CVE-2025-46762) has been discovered in Apache Parquet Java, exposing systems using the parquet-avro module to remote code execution (RCE) attacks.
The flaw, disclosed by Apache Parquet contributor Gang Wu on May 2, 2025, impacts versions up to and including 1.15.1.
Technical Breakdown of the Vulnerability
The vulnerability stems from insecure schema parsing in the parquet-avro module. Attackers could embed malicious code within Parquet file metadata, which executes automatically when a vulnerable system reads the file’s Avro schema.
While Apache Parquet 1.15.1 introduced partial mitigations by restricting untrusted packages, its default “trusted packages” configuration still permits code execution from pre-approved Java packages (e.g., java.util).
Exploitation depends on two conditions:
- The application uses the “specific” or “reflect” data models (not the safer “generic” model).
- The vulnerable system processes attacker-controlled Parquet files.
Affected Systems
- All Apache Parquet Java versions ≤ 1.15.1.
- Applications leveraging parquet-avro for deserialization in big data frameworks like Apache Spark, Hadoop, or Flink.
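As a rough aid for auditing dependency lists, the affected range above can be expressed as a simple version comparison. The sketch below is illustrative only: the class and method names are hypothetical, it assumes plain `major.minor.patch` version strings (any qualifier after a hyphen is ignored), and it is not part of Parquet itself.

```java
import java.util.Arrays;

public final class ParquetVersionCheck {
    // Last affected release according to the advisory (versions <= 1.15.1).
    private static final int[] LAST_AFFECTED = {1, 15, 1};

    // Parses "major.minor.patch" into three ints; missing parts default to 0.
    // Assumption: qualifiers such as "-SNAPSHOT" can be ignored for this check.
    static int[] parse(String version) {
        String core = version.split("-", 2)[0];
        String[] parts = core.split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // True when the version sorts at or below the last affected release.
    public static boolean isAffected(String version) {
        return Arrays.compare(parse(version), LAST_AFFECTED) <= 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.13.1", "1.15.1", "1.15.2"}) {
            System.out.println(v + " affected: " + isAffected(v));
        }
    }
}
```

A check like this only flags the library version. Whether a given application is actually exposed still depends on the data model in use and on where its Parquet files come from.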
Mitigation Strategies
The Apache Software Foundation recommends immediate action:
- Upgrade to Parquet Java 1.15.2, which fully resolves the issue by tightening package trust boundaries.
- For systems stuck on 1.15.1, set the JVM system property:
-Dorg.apache.parquet.avro.SERIALIZABLE_PACKAGES="" (an empty string).
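Where the JVM launch flags cannot be changed, the same property could in principle be set from code at startup. The following is a minimal sketch, not an official recommendation: the class and method names are hypothetical, and it assumes parquet-avro reads the property when its classes are first loaded, so the call would have to run before any Parquet code is touched. That timing assumption is not verified here.

```java
public final class ParquetAvroHardening {
    // Property name as given in the mitigation guidance above.
    static final String SERIALIZABLE_PACKAGES =
            "org.apache.parquet.avro.SERIALIZABLE_PACKAGES";

    // Sets the trusted-package list to an empty string.
    // Assumption: must be called before parquet-avro classes are loaded.
    public static void clearTrustedPackages() {
        System.setProperty(SERIALIZABLE_PACKAGES, "");
    }

    // Reports whether the property is present and empty in this JVM.
    public static boolean isHardened() {
        String value = System.getProperty(SERIALIZABLE_PACKAGES);
        return value != null && value.trim().isEmpty();
    }

    public static void main(String[] args) {
        clearTrustedPackages();
        System.out.println("trusted packages cleared: " + isHardened());
    }
}
```

Passing the flag on the command line remains the more dependable route, since it takes effect regardless of class-loading order.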
Organizations should also audit data pipelines to ensure the “generic” Avro model is used where possible, as it is immune to this exploit.
Security experts warn that unpatched systems are at risk of supply chain attacks, where corrupted Parquet files trigger backend exploits.
“This is a textbook example of how serialization vulnerabilities can bypass perimeter defenses,” said Maria Chen, CTO of cybersecurity firm DataShield. “Attackers could weaponize common data formats to infiltrate analytics platforms.”
The Apache team has released updated documentation emphasizing secure configuration practices for Avro schema handling.
Organizations handling sensitive data are urged to prioritize patching, as proof-of-concept exploits for similar vulnerabilities often emerge within days of public disclosure.