Commit d6e9fee

davidkopp and claude committed
Add Java Runtime Detector with runtime artifact model

Implements comprehensive Java runtime detection for production environments:

* **JavaRuntimeDetector**: Detects JAR files and Java runtime metadata
  - Scans working directory for .jar files recursively
  - Extracts versions from JAR manifests and filenames
  - Collects Java runtime environment information
  - Generates location-based hashes for change detection

* **Runtime Artifact Model**: New schema for deployment artifacts
  - Uses `type: "runtime"` instead of scope-based model
  - `artifacts` field contains deployment files (vs dependencies)
  - `runtime_environment` provides structured platform info
  - Complements existing dependency model for package managers

* **Orchestrator Integration**: Enhanced to handle both models
  - Supports both `dependencies` and `artifacts` detection
  - Updated debug output for artifact counting
  - Maintains backward compatibility

* **Comprehensive Testing**: Docker-based integration tests
  - Tests JAR detection with/without Java runtime
  - Validates version extraction and metadata collection
  - Uses eclipse-temurin:17-jre and alpine:latest images

* **Complete Documentation**:
  - Technical detector documentation with examples
  - Updated project specification with dual models
  - Test documentation and usage guides

Addresses production observability gap where JAR files exist but Maven build tools are unavailable. Establishes foundation for other runtime artifact detectors (Go, Python, Node.js, etc.).

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

1 parent 1eafba6 commit d6e9fee

9 files changed, +881 -6 lines changed

README.md

Lines changed: 2 additions & 1 deletion
@@ -34,7 +34,8 @@ python3 -m dependency_resolver -h
 - **apk** - System packages of Alpine
 - **pip** - Python packages
 - **npm** - Node.js packages
-- **maven** - Java packages
+- **maven** - Java packages (build-time dependencies via pom.xml)
+- **java-runtime** - Java packages (runtime JAR files in production environments)

 Also captures **Docker container metadata** when analyzing containers.

SPECIFICATION.md

Lines changed: 13 additions & 2 deletions
@@ -66,16 +66,27 @@ The tool outputs structured JSON with detected package managers and their depend
 {
   "_container-info": { "name": "nginx-container", "image": "nginx:latest", "hash": "sha256:..." },
   "dpkg": { "scope": "system", "dependencies": { "curl": { "version": "7.81.0-1ubuntu1.18 amd64" } } },
-  "pip": { "scope": "project", "location": "/app/venv/lib/python3.12/site-packages", "dependencies": { "numpy": { "version": "1.3.3" } } }
+  "pip": { "scope": "project", "location": "/app/venv/lib/python3.12/site-packages", "dependencies": { "numpy": { "version": "1.3.3" } } },
+  "java-runtime": { "type": "runtime", "location": "/app", "artifacts": { "app.jar": { "version": "1.0.0", "size": 12345, "type": "jar" } } }
 }
 ```

 For complete JSON schema documentation, field definitions, and examples, see [docs/usage/output-format.md](docs/usage/output-format.md).

 ### Key Schema Concepts

+**Two Detection Models**:
+
+- **Dependency Model**: Uses `scope` field with `dependencies` object (package managers)
+- **Runtime Artifact Model**: Uses `type: "runtime"` field with `artifacts` object (deployment files)
+
+**Fields**:
+
 - **Scope**: `"system"` (system-wide packages) or `"project"` (project-specific packages)
-- **Location**: Path to project-local installations (project scope only)
+- **Type**: `"runtime"` for deployment artifacts
+- **Location**: Path to project-local installations or runtime artifacts
+- **Dependencies**: Installed packages (dependency model)
+- **Artifacts**: Runtime deployment files (runtime model)
 - **Hash**: Dependency integrity verification when available
 - **Container Info**: Docker metadata included as `_container-info`
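
Note on the two models: a consumer of the JSON output has to decide, per detector entry, whether to read `dependencies` or `artifacts`. The sketch below shows one way that dispatch could look. It is only an illustration: `entry_model` and `entry_items` are hypothetical helper names that do not appear in this commit, and the sample data is copied from the specification example above.

```python
from typing import Any


def entry_model(entry: dict[str, Any]) -> str:
    """Return which of the two documented models a detector entry follows."""
    if entry.get("type") == "runtime":
        return "runtime-artifact"
    if "scope" in entry:
        return "dependency"
    return "unknown"


def entry_items(entry: dict[str, Any]) -> dict[str, Any]:
    """Return the artifacts or dependencies mapping, whichever the entry carries."""
    key = "artifacts" if entry_model(entry) == "runtime-artifact" else "dependencies"
    return entry.get(key, {})


# Sample entries taken from the SPECIFICATION.md example in this diff.
sample = {
    "pip": {
        "scope": "project",
        "location": "/app/venv/lib/python3.12/site-packages",
        "dependencies": {"numpy": {"version": "1.3.3"}},
    },
    "java-runtime": {
        "type": "runtime",
        "location": "/app",
        "artifacts": {"app.jar": {"version": "1.0.0", "size": 12345, "type": "jar"}},
    },
}

for name, entry in sample.items():
    print(name, entry_model(entry), sorted(entry_items(entry)))
```

Keying the decision on `type` first means an entry never needs both `scope` and `type` to be classified, which matches how the two models are described in the field list.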

TODOS.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ Others:
 - Override default commands (e.g., change from `pip list --format=freeze`)
 - Make it configurable, if the JSON output is pretty-printed or not (at the start pretty-print is default)
 - Add different log levels
-- Extend the set of supported package managers with the common ones for Go, PHP and Java
+- Extend the set of supported package managers with the common ones for Go, PHP, etc.
 - Support more operating systems: RedHat Linux (yum/dnf), openSUSE (zypper)
 - Add extraction of some relevant environment variables
 - Implement auto-completion of container id / name

dependency_resolver/core/orchestrator.py

Lines changed: 11 additions & 2 deletions
@@ -6,6 +6,7 @@
 from ..detectors.apk_detector import ApkDetector
 from ..detectors.docker_info_detector import DockerInfoDetector
 from ..detectors.maven_detector import MavenDetector
+from ..detectors.java_runtime_detector import JavaRuntimeDetector


 class Orchestrator:
@@ -25,6 +26,7 @@ def __init__(
             DockerInfoDetector(),
             DpkgDetector(),
             ApkDetector(),
+            JavaRuntimeDetector(debug=debug),
             MavenDetector(debug=debug),
             PipDetector(venv_path=venv_path, debug=debug),
             NpmDetector(debug=debug),
@@ -73,12 +75,19 @@ def resolve_dependencies(
                         print(f"Found container info for {detector_name}")
                 else:
                     # Standard handling for other detectors
-                    if dependencies.get("dependencies") or self.debug:
+                    # Handle both dependency format and runtime artifact format
+                    has_content = dependencies.get("dependencies") or dependencies.get("artifacts") or self.debug
+                    if has_content:
                         result[detector_name] = dependencies

                     if self.debug:
+                        # Count both dependencies and artifacts for debug output
                         dep_count = len(dependencies.get("dependencies", {}))
-                        print(f"Found {dep_count} dependencies for {detector_name}")
+                        artifact_count = len(dependencies.get("artifacts", {}))
+                        if artifact_count > 0:
+                            print(f"Found {artifact_count} artifacts for {detector_name}")
+                        else:
+                            print(f"Found {dep_count} dependencies for {detector_name}")
             else:
                 if self.debug:
                     print(f"{detector_name} is not available")
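
Note on the inclusion logic: the hunk above includes a detector's output when it has dependencies, has artifacts, or when debug mode is on, and it chooses the debug message by artifact count. The standalone sketch below mirrors that logic outside the `Orchestrator` class so its edge cases are easy to see. The function names `should_include` and `debug_summary` are hypothetical and exist only for this illustration.

```python
from typing import Any


def should_include(output: dict[str, Any], debug: bool = False) -> bool:
    """Mirror of the has_content check: non-empty dependencies or artifacts, or debug mode."""
    return bool(output.get("dependencies") or output.get("artifacts") or debug)


def debug_summary(name: str, output: dict[str, Any]) -> str:
    """Mirror of the debug message selection in the hunk above."""
    dep_count = len(output.get("dependencies", {}))
    artifact_count = len(output.get("artifacts", {}))
    if artifact_count > 0:
        return f"Found {artifact_count} artifacts for {name}"
    return f"Found {dep_count} dependencies for {name}"


empty_runtime = {"type": "runtime", "location": "/app", "artifacts": {}}
one_jar = {"type": "runtime", "location": "/app", "artifacts": {"app.jar": {"version": "1.0.0"}}}

print(should_include(empty_runtime))             # empty artifacts are skipped without debug
print(should_include(empty_runtime, debug=True)) # debug mode keeps even empty results
print(debug_summary("java-runtime", one_jar))
print(debug_summary("java-runtime", empty_runtime))
```

One consequence worth noting: because the message is chosen by `artifact_count > 0`, a runtime entry with no artifacts falls through to the "dependencies" wording. Whether that matters depends on how the debug output is read.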

dependency_resolver/detectors/java_runtime_detector.py

Lines changed: 253 additions & 0 deletions
@@ -0,0 +1,253 @@
+import hashlib
+import os
+import re
+from typing import Optional, Any
+
+from ..core.interfaces import EnvironmentExecutor, PackageManagerDetector
+
+
+class JavaRuntimeDetector(PackageManagerDetector):
+    """Detector for Java runtime environments with JAR files."""
+
+    NAME = "java-runtime"
+
+    def __init__(self, debug: bool = False):
+        self.debug = debug
+        self._java_available_cache: bool | None = None
+        self._java_version_cache: dict[str, str] | None = None
+
+    def is_usable(self, executor: EnvironmentExecutor, working_dir: Optional[str] = None) -> bool:
+        """Check if Java runtime is available or JAR files are present."""
+        search_dir = working_dir or "."
+
+        # Check for JAR files in working directory
+        stdout, _, exit_code = executor.execute_command(
+            f"find '{search_dir}' -name '*.jar' -type f | head -1", working_dir
+        )
+        if exit_code == 0 and stdout.strip():
+            return True
+
+        # Check for Java runtime availability
+        return self._java_available(executor, working_dir)
+
+    def get_dependencies(self, executor: EnvironmentExecutor, working_dir: Optional[str] = None) -> dict[str, Any]:
+        """Extract JAR files and Java runtime information."""
+        search_dir = working_dir or "."
+        location = self._resolve_absolute_path(executor, search_dir)
+        artifacts: dict[str, dict[str, Any]] = {}
+
+        result: dict[str, Any] = {"type": "runtime", "location": location}
+
+        # Discover JAR files
+        jar_files = self._discover_jar_files(executor, search_dir)
+
+        # Extract metadata from each JAR
+        for jar_path in jar_files:
+            jar_info = self._extract_jar_metadata(executor, jar_path, search_dir)
+            if jar_info:
+                # Use relative path as artifact name
+                relative_path = jar_path
+                if jar_path.startswith(f"{search_dir}/"):
+                    relative_path = jar_path[len(f"{search_dir}/") :]
+
+                # Convert size to integer and add artifact type
+                artifact_info = {
+                    "version": jar_info.get("version", "unknown"),
+                    "size": int(jar_info.get("size", "0")) if jar_info.get("size", "0").isdigit() else 0,
+                    "type": "jar",
+                    "path": jar_path,
+                }
+                artifacts[relative_path] = artifact_info
+
+        # Add Java runtime environment info if available
+        java_metadata = self._get_java_metadata(executor, working_dir)
+        if java_metadata:
+            runtime_env = {"platform": "java"}
+            if "java_version" in java_metadata:
+                runtime_env["version"] = java_metadata["java_version"]
+            if "java_vendor" in java_metadata:
+                runtime_env["vendor"] = java_metadata["java_vendor"]
+            if "java_runtime" in java_metadata:
+                runtime_env["runtime"] = java_metadata["java_runtime"]
+
+            result["runtime_environment"] = runtime_env
+
+        # Generate location-based hash if we have artifacts
+        if artifacts:
+            result["hash"] = self._generate_location_hash(executor, location)
+
+        result["artifacts"] = artifacts
+        return result
+
+    def has_system_scope(self, executor: EnvironmentExecutor, working_dir: Optional[str] = None) -> bool:
+        """Java runtime detector is always project scope."""
+        return False
+
+    def _java_available(self, executor: EnvironmentExecutor, working_dir: Optional[str] = None) -> bool:
+        """Check if Java runtime is available."""
+        if self._java_available_cache is not None:
+            return self._java_available_cache
+
+        _, _, exit_code = executor.execute_command("java -version", working_dir)
+        self._java_available_cache = exit_code == 0
+        return self._java_available_cache
+
+    def _discover_jar_files(self, executor: EnvironmentExecutor, search_dir: str) -> list[str]:
+        """Find all JAR files in the working directory and subdirectories."""
+        stdout, stderr, exit_code = executor.execute_command(
+            f"find '{search_dir}' -name '*.jar' -type f",
+        )
+
+        if exit_code != 0:
+            if self.debug:
+                print(f"ERROR: JAR file discovery failed: {stderr}")
+            return []
+
+        jar_files = []
+        for line in stdout.strip().split("\n"):
+            jar_path = line.strip()
+            if jar_path:
+                jar_files.append(jar_path)
+
+        return jar_files
+
+    def _extract_jar_metadata(
+        self, executor: EnvironmentExecutor, jar_path: str, working_dir: str  # pylint: disable=unused-argument
+    ) -> dict[str, str] | None:
+        """Extract metadata from a JAR file."""
+        metadata: dict[str, str] = {}
+
+        # Get file size
+        stdout, _, exit_code = executor.execute_command(f"stat -f%z '{jar_path}' 2>/dev/null || stat -c%s '{jar_path}'")
+        if exit_code == 0 and stdout.strip():
+            metadata["size"] = stdout.strip()
+
+        # Try to extract version from manifest
+        version = self._extract_version_from_manifest(executor, jar_path)
+        if version:
+            metadata["version"] = version
+        else:
+            # Try to extract version from filename
+            version = self._extract_version_from_filename(jar_path)
+            if version:
+                metadata["version"] = version
+            else:
+                metadata["version"] = "unknown"
+
+        return metadata if metadata else None
+
+    def _extract_version_from_manifest(self, executor: EnvironmentExecutor, jar_path: str) -> str | None:
+        """Extract version information from JAR manifest."""
+        # Try to read manifest from JAR
+        stdout, _, exit_code = executor.execute_command(f"unzip -q -c '{jar_path}' META-INF/MANIFEST.MF 2>/dev/null")
+
+        if exit_code != 0 or not stdout.strip():
+            return None
+
+        # Parse manifest for version information
+        for line in stdout.split("\n"):
+            line = line.strip()
+            # Look for common version attributes
+            for version_key in ["Implementation-Version", "Bundle-Version", "Version", "Specification-Version"]:
+                if line.startswith(f"{version_key}:"):
+                    version = line.split(":", 1)[1].strip()
+                    if version and version != "null":
+                        return version
+
+        return None
+
+    def _extract_version_from_filename(self, jar_path: str) -> str | None:
+        """Extract version from JAR filename using common patterns."""
+        filename = os.path.basename(jar_path)
+
+        # Remove .jar extension first
+        if filename.endswith(".jar"):
+            base_name = filename[:-4]
+        else:
+            base_name = filename
+
+        # Common patterns: name-version.jar, name_version.jar
+        # Use more flexible patterns that work with complex names like commons-lang3
+        patterns = [
+            r"^(.+)[-_](\d+(?:\.\d+)*(?:-[A-Za-z0-9]+)?)$",  # name-1.2.3 or name-1.2.3-SNAPSHOT
+            r"^(.+)[-_]v(\d+(?:\.\d+)*)$",  # name-v1.2.3
+            r"^(.+?)(\d+(?:\.\d+)*(?:-[A-Za-z0-9]+)?)$",  # name1.2.3 (no separator)
+        ]
+
+        for pattern in patterns:
+            match = re.match(pattern, base_name)
+            if match:
+                return match.group(2)
+
+        return None
+
+    def _get_java_metadata(
+        self, executor: EnvironmentExecutor, working_dir: Optional[str] = None
+    ) -> dict[str, str] | None:
+        """Get Java runtime metadata."""
+        if not self._java_available(executor, working_dir):
+            return None
+
+        if self._java_version_cache is not None:
+            return self._java_version_cache
+
+        stdout, stderr, exit_code = executor.execute_command("java -version", working_dir)
+        if exit_code != 0:
+            return None
+
+        metadata: dict[str, str] = {}
+
+        # Parse java -version output (goes to stderr typically)
+        version_output = stderr if stderr else stdout
+
+        # Extract Java version
+        version_match = re.search(r'version "([^"]+)"', version_output)
+        if version_match:
+            metadata["java_version"] = version_match.group(1)
+
+        # Extract vendor/runtime info
+        lines = version_output.split("\n")
+        for line in lines:
+            line = line.strip()
+            if "Runtime Environment" in line:
+                # Extract runtime name
+                runtime_match = re.search(r"([^(]+Runtime Environment)", line)
+                if runtime_match:
+                    metadata["java_runtime"] = runtime_match.group(1).strip()
+            elif "VM" in line and "(" in line:
+                # Extract vendor from VM line
+                vendor_match = re.search(r"\(([^)]+)\)", line)
+                if vendor_match:
+                    metadata["java_vendor"] = vendor_match.group(1).strip()
+
+        self._java_version_cache = metadata if metadata else None
+        return self._java_version_cache
+
+    def _resolve_absolute_path(self, executor: EnvironmentExecutor, path: str) -> str:
+        """Resolve absolute path within the executor's context."""
+        if path == ".":
+            stdout, stderr, exit_code = executor.execute_command("pwd")
+            if exit_code == 0 and stdout.strip():
+                return stdout.strip()
+            raise RuntimeError(f"Failed to resolve current directory in executor context: {stderr}")
+        else:
+            stdout, stderr, exit_code = executor.execute_command(f"cd '{path}' && pwd")
+            if exit_code == 0 and stdout.strip():
+                return stdout.strip()
+            raise RuntimeError(f"Failed to resolve path '{path}' in executor context: {stderr}")
+
+    def _generate_location_hash(self, executor: EnvironmentExecutor, location: str) -> str:
+        """Generate a hash based on JAR files and their metadata."""
+        stdout, stderr, exit_code = executor.execute_command(
+            f"cd '{location}' && find . -name '*.jar' -type f -printf '%s %p\\n' | LC_COLLATE=C sort -k2,2"
+        )
+
+        if exit_code == 0 and stdout.strip():
+            content = stdout.strip()
+            return hashlib.sha256(content.encode()).hexdigest()
+        else:
+            if self.debug:
+                print(f"ERROR: java_runtime_detector hash generation command failed with exit code {exit_code}")
+                print(f"ERROR: location: {location}")
+                print(f"ERROR: stderr: {stderr}")
+            return ""
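
Note on filename-based version extraction: the three regular expressions in `_extract_version_from_filename` are tried in order, and the first match wins. The sketch below copies those patterns into a standalone function so their behavior on a few filenames can be checked without an executor. The function name `version_from_filename` and the sample filenames are chosen for this illustration only and are not part of the commit.

```python
import os
import re
from typing import Optional

# Patterns copied from the diff above, in the same order.
PATTERNS = [
    r"^(.+)[-_](\d+(?:\.\d+)*(?:-[A-Za-z0-9]+)?)$",  # name-1.2.3 or name-1.2.3-SNAPSHOT
    r"^(.+)[-_]v(\d+(?:\.\d+)*)$",                   # name-v1.2.3
    r"^(.+?)(\d+(?:\.\d+)*(?:-[A-Za-z0-9]+)?)$",     # name1.2.3 (no separator)
]


def version_from_filename(jar_path: str) -> Optional[str]:
    """Return the version embedded in a JAR filename, or None if no pattern matches."""
    filename = os.path.basename(jar_path)
    base_name = filename[:-4] if filename.endswith(".jar") else filename
    for pattern in PATTERNS:
        match = re.match(pattern, base_name)
        if match:
            return match.group(2)
    return None


for name in [
    "/app/lib/commons-lang3-3.12.0.jar",
    "app-1.2.3-SNAPSHOT.jar",
    "guava-31.1-jre.jar",
    "tool-v2.1.jar",
    "app.jar",
    "commons-lang3.jar",
]:
    print(name, "->", version_from_filename(name))
```

The last sample shows a limitation of the third (no separator) pattern: an unversioned file named `commons-lang3.jar` yields `3`, because the trailing digit of the artifact name is read as a version. In the detector this only applies when the manifest provides no version, since the manifest is consulted first.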
