migration.discovery -- Agentless Java Application Discovery
An Ansible collection that performs agentless discovery and deep inspection of Java applications on target VMs for migration planning. It identifies running Java application servers, extracts their configurations, introspects WAR/EAR deployments, scores migrate-ability, and produces enterprise-grade reports -- all without installing agents on target hosts.
Scope: Java applications only. Other runtimes (Node.js, Python, .NET, Go) are on the roadmap but not yet implemented.
Documentation Map
| Document | Description |
|---|---|
| CHANGELOG.md | Version history and release notes |
| AGENTS.md | Coding conventions for AI and human contributors |
| Architecture | |
| docs/architecture/ | 9 Mermaid diagrams: system, manifest gen flow, verdict logic, AAP integration, failure modes |
| docs/MTA_TO_CONTAINERFILE.md | Headline pitch: MTA + discovery + LLM → Containerfile (validated hybrid architecture) |
| docs/SBOM.md | CycloneDX SBOM generation via syft; tool selection rationale |
| docs/KAI_INTEGRATION.md | Kai LLM integration; kai_refine role; OpenRouter backend |
| Validation | |
| docs/E2E_PROOF.md | Application-level evidence hierarchy with honest gaps |
| docs/FLEET_VALIDATION.md | 10-VM / 12-app fleet pipeline results (100% LLM coverage via OpenRouter) |
| docs/INTERNAL_VERIFICATION.md | E2E build/deploy/verify results |
| docs/AAP_E2E_VALIDATION.md | AAP Job Template + Workflow validation |
| docs/MTA_ANALYZER_VALIDATION.md | MTA analyzer recipe + PetClinic submission |
| Quality | |
| docs/BLIND_RUN_AUDIT.md | Hardcoded assumption audit with severity ratings |
| docs/ANSIBLE_REVIEW.md | Ansible review findings + mitigations |
| docs/SECURITY.md | Validator account lifecycle, secret hygiene, LLM redaction rules |
| docs/OPENROUTER_MIGRATION.md | LiteLLM → OpenRouter migration record |
| Test artifacts | |
| docs/TEST_FLEET.md | Test fleet VM patterns and provisioning status |
| docs/fixtures/ | Reference JSON fixtures (fleet report, MTA partial) |
| Runbooks | |
| docs/runbooks/ | Operational runbooks for common workflows |
Step 1: Migration Readiness Report
Start here. Before any migration work begins, run the fleet report against your target VMs. This produces a customer-facing HTML report with Green/Yellow/Red verdicts for every host in the estate.
# Create an inventory of target VMs
cat > inventory.ini <<EOF
[all]
vm1 ansible_host=10.0.1.10
vm2 ansible_host=10.0.1.11
vm3 ansible_host=10.0.1.12
[all:vars]
ansible_user=deploy
ansible_become=true
EOF
# Run the Migration Readiness Assessment
ansible-playbook -i inventory.ini playbooks/report-fleet.yml \
-e fleet_report_customer_name="Acme Corp" \
-e fleet_report_engagement_name="Java Migration Assessment"
Output:
- migration-readiness-report.html -- Self-contained HTML report (Red Hat branded, print-friendly)
- fleet-report-data.json -- Structured JSON data reusable as input to the manifest-generation pipeline
The report includes:
- Executive summary with Green/Yellow/Red donut and runtime breakdown
- Per-host detail cards with JVM config, deployments, JDBC connections, keystores, secrets
- Risk register aggregated across the fleet
- Wave-based recommendations (Green first, then Yellow, then Red for architecture review)
- Migration checklists per host
The same JSON data feeds directly into the generate_manifests role to produce Containerfiles, Kubernetes Deployments, Services, NetworkPolicies, and ExternalSecrets.
AAP Integration
Register as a Job Template in Ansible Automation Platform:
| Field | Value |
|---|---|
| Playbook | playbooks/report-fleet.yml |
| Extra Variables | fleet_report_customer_name, fleet_report_engagement_name |
| Inventory | Target VMs |
| Credentials | Machine credential with SSH + become |
The HTML report is saved as a job artifact viewable in the AAP UI.
What This Collection Does
Point this collection at any Linux VM (via SSH) and it will:
- Inventory the OS, hardware, packages, services, processes, and listening ports
- Discover all installed Java environments (JDK/JRE locations, versions, vendors)
- Match findings against a library of Java application detector definitions
- Extract configuration files and runtime details for each detected application
- Perform deep inspection: JVM analysis, keystore enumeration, JDBC parsing, WAR introspection, build file analysis
- Score each host for migrate-ability (Green / Yellow / Red)
- Produce per-host JSON reports, a fleet HTML report, and an MTA handoff section for downstream migration tooling
Supported Java application detectors:
| Application | Category | Key Detection Signals | Status |
|---|---|---|---|
| Apache Tomcat | Application Server | catalina process, port 8080/8443 | Validated |
| WildFly / JBoss EAP | Application Server | jboss-modules.jar, port 9990 | Validated |
| Spring Boot | Application Server | JarLauncher process, spring .jar | Experimental |
| Oracle WebLogic | Application Server | weblogic.Server, port 7001/7002 | Experimental |
| IBM WebSphere Traditional | Application Server | WsServer, port 9080/9443 | Experimental |
| IBM WebSphere Liberty | Application Server | ws-server.jar, port 9080/9443 | Experimental |
Experimental detectors have detection patterns defined but have not been validated against a live instance. Use at your own risk and verify results manually.
Architecture
Layer 1: GATHER       Layer 2: DETECT+EXTRACT    Layer 3: DEEP INSPECT         Layer 4: REPORT
+---------------+     +--------------------+     +-----------------------+     +-----------------+
| gather_facts  | --> | detect_apps        | --> | deep_inspect          | --> | report          |
|               |     | extract_config     |     |                       |     | fleet_report    |
| - OS/hardware |     | - Load detectors   |     | - JVM analysis        |     | gen_manifests   |
| - packages    |     | - Pattern matching |     | - Java version        |     |                 |
| - services    |     | - Confidence score |     | - Keystores           |     | - JSON per-host |
| - processes   |     | - Config reading   |     | - JDBC connections    |     | - HTML fleet    |
| - ports       |     | - Version detect   |     | - WAR introspection   |     | - G/Y/R scoring |
| - Java envs   |     | - Sensitive redact |     | - Build file analysis |     | - Containerfile |
| - filesystem  |     |                    |     | - Log config          |     | - K8s manifests |
+---------------+     +--------------------+     | - Network conns       |     +-----------------+
                                                 | - Env vars, cron      |
                                                 | - System packages     |
                                                 | - Secrets detection   |
                                                 +-----------------------+
Role: gather_facts
Collects raw data from the target host using Ansible built-in modules (setup, package_facts, service_facts) and shell commands (ps auxww, ss -tlnp, find). Discovers all installed Java environments by scanning /usr/lib/jvm, /usr/java, /opt/java, and alternatives. All results are stored under the discovered_facts namespace.
Role: detect_apps
Loads YAML detector definitions from roles/detect_apps/files/detectors/ and matches them against gathered facts using five methods:
- Process patterns -- regex against running process command lines
- Package patterns -- regex against installed package names
- Service patterns -- regex against registered services
- Port patterns -- exact match against listening port numbers
- Filesystem patterns -- glob-to-regex against discovered directories and marker files
Confidence is scored by how many methods matched: high (3+), medium (2), low (1).
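The scoring rule above can be sketched in a few lines of Python. This is an illustration of the stated thresholds only; the function name `score_confidence` and the method labels are assumptions, not the role's actual implementation.

```python
def score_confidence(matched_methods):
    """Map the detection methods that matched to a confidence label.

    Thresholds follow the rule stated above: 3+ methods -> high,
    2 -> medium, 1 -> low. Duplicate method names are counted once.
    """
    count = len(set(matched_methods))
    if count >= 3:
        return "high"
    if count == 2:
        return "medium"
    if count == 1:
        return "low"
    return None  # nothing matched, so the detector did not fire


print(score_confidence(["process", "service", "port", "filesystem"]))
```

Counting distinct methods rather than individual pattern hits means that two process patterns matching the same JVM still count as one signal.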
Role: extract_config
For each detected application, this role:
- Determines the application home directory from process arguments or filesystem paths
- Reads configuration files (capped at 1MB each) with sensitive-field redaction
- Captures JVM arguments, environment variables, and specific listening ports
- Runs version detection commands
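As a rough illustration of sensitive-field redaction, the sketch below replaces secret-looking values in XML attributes and properties-style lines with [REDACTED]. The key list and regular expressions are assumptions chosen for the example; the role's real patterns may differ.

```python
import re

# Hypothetical list of key fragments that usually carry secrets.
_SECRET_KEY = r"[\w.\-]*(?:password|passwd|secret|keystorePass|truststorePass)[\w.\-]*"

# XML attribute form:  keystorePass="changeit"
_XML_ATTR = re.compile(rf'(\b{_SECRET_KEY}\s*=\s*")[^"]*(")', re.IGNORECASE)

# Properties form:  db.password=s3cret   or   db.password: s3cret
_PROP_LINE = re.compile(
    rf"^(\s*{_SECRET_KEY}\s*[=:]\s*).+$", re.IGNORECASE | re.MULTILINE
)


def redact(text):
    """Return text with secret values replaced by [REDACTED]."""
    text = _XML_ATTR.sub(r"\1[REDACTED]\2", text)
    text = _PROP_LINE.sub(r"\1[REDACTED]", text)
    return text


print(redact('<Connector port="8443" keystorePass="changeit" />'))
print(redact("db.user=app\ndb.password=s3cret\n"))
```

Only the value is replaced, so the report still shows that a secret exists and where it lives.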
Role: deep_inspect
For each detected Java application, performs deep analysis:
- JVM analysis: Parses full command line for -Xms, -Xmx, -XX:* flags, -D system properties, classpath, -javaagent entries, and GC algorithm
- Java version: Runs java -version from the process's actual java binary (from /proc/PID/cmdline and /proc/PID/exe)
- Keystores: Finds *.jks, *.p12, *.pfx, *.keystore files under app home and JVM arg paths. For each, runs keytool -list -v to extract alias, DN, and expiry (never key material)
- JDBC connections: Parses config files for JDBC URLs -- extracts host, port, database, driver
- System packages: rpm -qa or dpkg -l filtered by java, jdk, tomcat, jboss, etc.
- Environment variables: Captures JAVA_HOME, CATALINA_HOME, JBOSS_HOME, WAS_HOME, MW_HOME, CLASSPATH from /proc/PID/environ, setenv.sh, and systemd units
- Cron jobs: Parses crontabs for anything referencing the app
- Log configuration: Finds log4j.properties, log4j2.xml, logback.xml, logging.properties -- extracts log file paths
- Network connections: ss -tnp for the app's PID -- external hosts/ports it connects to
- WAR/EAR introspection: Lists deployments (capped at 20), extracts web.xml (servlets, filters, listeners), Maven coordinates from pom.properties, MANIFEST.MF, WEB-INF/lib jar inventory, Spring Boot and JPA detection
- Build file analysis: Parses pom.xml and build.gradle for dependencies, frameworks (Spring Boot, Java EE, Jakarta EE), database drivers, and messaging libraries
- Secrets detection: Scans config files for known secret patterns (keystorePass, JDBC passwords, etc.)
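To make the JDBC parsing step concrete, here is a minimal Python sketch that pulls host, port, and database out of URLs of the form jdbc:<subprotocol>://host:port/database. The function name, the regular expression, and the small driver lookup table are assumptions for illustration. Other URL shapes (for example Oracle thin URLs) are not handled here.

```python
import re

# Matches jdbc:<subprotocol>://host[:port][/database]; other URL shapes are ignored.
_JDBC_URL = re.compile(
    r"""jdbc:(?P<subprotocol>[a-z0-9]+)://(?P<host>[^:/?;\s"']+)"""
    r"""(?::(?P<port>\d+))?(?:/(?P<database>[^?;\s"'&<]*))?""",
    re.IGNORECASE,
)

# Illustrative lookup only; a real implementation would cover more drivers.
_DRIVERS = {
    "postgresql": "org.postgresql.Driver",
    "mysql": "com.mysql.cj.jdbc.Driver",
    "mariadb": "org.mariadb.jdbc.Driver",
}


def find_jdbc_connections(text, source_file):
    """Return one dict per JDBC URL found in a config file's text."""
    results = []
    for match in _JDBC_URL.finditer(text):
        results.append({
            "source_file": source_file,
            "url": match.group(0),
            "host": match.group("host"),
            "port": match.group("port"),
            "database": match.group("database"),
            "driver": _DRIVERS.get(match.group("subprotocol").lower()),
        })
    return results


context_xml = '<Resource name="jdbc/app" url="jdbc:postgresql://db.internal:5432/appdb" username="app"/>'
print(find_jdbc_connections(context_xml, "/opt/tomcat/conf/context.xml"))
```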
Role: report
Renders a Jinja2 template into a JSON report at {{ output_dir }}/{{ inventory_hostname }}.json and prints a human-readable summary to stdout.
Role: fleet_report
Collects discovery data from all hosts in the play, merges deep inspection results into each detected application, and scores every host for migration readiness:
- Green -- Ready to containerize (0-2 migration flags)
- Yellow -- Requires targeted manual work (3-5 flags: secrets, JDBC, keystores)
- Red -- Needs architecture review (6+ flags: complex runtimes, clustering)
- Gray -- No Java applications detected
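The verdict thresholds above reduce to a small function. The sketch below assumes the number of Java applications and the number of raised migration flags have already been counted for the host; how the role counts flags is not shown here, and the function name is hypothetical.

```python
def readiness_verdict(java_app_count, flag_count):
    """Return the Green/Yellow/Red/Gray verdict for one host.

    Thresholds follow the list above: 0-2 flags -> Green,
    3-5 -> Yellow, 6+ -> Red, and Gray when no Java app was found.
    """
    if java_app_count == 0:
        return "Gray"
    if flag_count <= 2:
        return "Green"
    if flag_count <= 5:
        return "Yellow"
    return "Red"


for apps, flags in [(1, 0), (2, 4), (1, 7), (0, 0)]:
    print(apps, flags, readiness_verdict(apps, flags))
```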
Produces:
- Self-contained HTML report (migration-readiness-report.html) with executive summary, per-host detail cards, risk register, and wave-based recommendations
- Structured JSON (fleet-report-data.json) reusable as input to generate_manifests
Role: generate_manifests
Generates enterprise-grade OpenShift manifests from discovery data:
- Containerfile -- UBI9 base, non-root USER 185, OCI labels
- Deployment -- Liveness/readiness probes, security context (restricted-v2), resource limits from JVM heap, OTEL annotations
- Service -- ClusterIP for discovered ports
- Route -- TLS edge or passthrough (based on keystore detection)
- ConfigMap -- JVM opts, system properties
- ExternalSecret -- Vault references for discovered secrets
- NetworkPolicy -- Ingress/egress rules from discovered connections
- Kustomization -- oc apply -k bundle
Role: kai_refine
LLM-assisted Containerfile refinement. Takes the deterministic Containerfile produced by generate_manifests and uses an LLM (via Kai) to suggest improvements: missing system packages, multi-stage build optimization, layer ordering, and security hardening. Toggle with use_kai_containerfile: true|false (default false). See docs/KAI_INTEGRATION.md for details.
Role: drift_detect
Detects configuration drift between the source VM and its containerized counterpart. Compares JVM args, environment variables, ports, JDBC connections, and system packages. Produces a drift report highlighting differences that may indicate migration regressions.
Run via the dedicated playbook:
ansible-playbook -i inventory.ini playbooks/detect-drift.yml \
-e drift_detect_source_host=legacy-tomcat \
-e drift_detect_container_ns=test-fleet-monolith
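Conceptually, the drift comparison is a per-section diff between what was discovered on the source VM and what is observed in the container. The Python sketch below illustrates that idea under assumed section names (jvm_args, environment_vars, ports, jdbc_connections, system_packages) and an assumed report shape. It is not the role's implementation.

```python
# Hypothetical section names mirroring the categories the role compares.
SECTIONS = ("jvm_args", "environment_vars", "ports", "jdbc_connections", "system_packages")


def detect_drift(source, container):
    """Return only the sections that differ between source VM and container."""
    report = {}
    for section in SECTIONS:
        before, after = source.get(section), container.get(section)
        if isinstance(before, list) and isinstance(after, list):
            # Lists are compared without regard to order.
            missing = [item for item in before if item not in after]
            added = [item for item in after if item not in before]
            if missing or added:
                report[section] = {"only_on_source": missing, "only_in_container": added}
        elif before != after:
            report[section] = {"source": before, "container": after}
    return report


vm = {"ports": [8080, 8443], "jvm_args": ["-Xmx2048m"], "environment_vars": {"JAVA_HOME": "/a"}}
pod = {"ports": [8080], "jvm_args": ["-Xmx2048m"], "environment_vars": {"JAVA_HOME": "/a"}}
print(detect_drift(vm, pod))
```

An empty result means no drift was found in the compared sections; it says nothing about sections that were never collected on either side.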
Usage
Prerequisites
- Ansible 2.12+ on the control node
- syft (optional, for SBOM generation): curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh
- SSH access to target hosts
- become (sudo) privileges on target hosts for reading config files and /proc
Install the collection
# From the repository
ansible-galaxy collection install git+https://git.arsalan.io/anaeem/ansible-collection-discovery.git
# Or from a local checkout
cd /path/to/ansible-collection-discovery
ansible-galaxy collection build
ansible-galaxy collection install migration-discovery-2.0.0.tar.gz
Run discovery
# Against a single host
ansible-playbook -i "target-vm," playbooks/discover.yml
# Against an inventory
ansible-playbook -i inventory.ini playbooks/discover.yml
# With custom output directory
ansible-playbook -i inventory.ini playbooks/discover.yml -e output_dir=/opt/reports
# Limit to specific hosts
ansible-playbook -i inventory.ini playbooks/discover.yml --limit webservers
Build and deploy
After discovery and manifest generation, use the build-and-deploy playbook to build the container image and deploy to OpenShift:
ansible-playbook playbooks/build-and-deploy.yml \
-e app_name=classic-monolith \
-e source_host=172.16.2.44 \
-e target_namespace=test-fleet-monolith
Detect drift
After deploying, run drift detection to compare source VM and container:
ansible-playbook -i inventory.ini playbooks/detect-drift.yml \
-e drift_detect_source_host=legacy-tomcat \
-e drift_detect_container_ns=test-fleet-monolith
AAP Integration (Full Path)
For Ansible Automation Platform users, the recommended setup is:
| AAP Resource | Details |
|---|---|
| Project | SCM pointing to this collection's Git repository, branch main, SCM update on launch |
| Job Template 1 | "Step 1 -- Migration Readiness Report" using playbooks/report-fleet.yml with survey for customer_name and engagement_name |
| Job Template 2 | "Step 2 -- Discovery + Manifest Generation" using playbooks/discover.yml |
| Workflow Template | Chain JT1 (fleet report) on success into JT2 (manifest generation) for a full pipeline run |
| Inventory | Target VMs with Machine credential (SSH + become) |
The fleet report HTML is exposed as a job artifact via set_stats. See docs/AAP_E2E_VALIDATION.md for validated results.
Inventory example
[java_servers]
tomcat01.example.com
jboss01.example.com
weblogic01.example.com
[all:vars]
ansible_user=deploy
ansible_become=true
How to Add a New Detector
Adding a new Java application detector requires only creating a single YAML file in roles/detect_apps/files/detectors/. No code changes are needed. The file is loaded automatically on the next playbook run.
Detector Schema Reference
Every field explained:
# ==============================================================
# REQUIRED FIELDS
# ==============================================================
name: "Human Readable Name"
# Display name shown in reports and logs.
# Example: "Apache Tomcat", "Oracle WebLogic"
id: "snake_case_id"
# Unique identifier used as dictionary keys in Ansible facts.
# Must be unique across all detectors. Use lowercase with underscores.
# Example: "tomcat", "jboss_wildfly", "websphere_liberty"
category: "application_server"
# Classification for grouping in reports.
# Values: application_server, web_server, database, runtime, middleware, monitoring
# ==============================================================
# DETECTION RULES (at least one section should have entries)
# ==============================================================
detect:
  processes:
    # List of regex patterns matched against `ps auxww` command lines.
    # Each entry has a single "pattern" key containing a regex string.
    # Backslash-escape dots in package names: "org\\.apache" not "org.apache"
    - pattern: "org\\.apache\\.catalina\\.startup\\.Bootstrap"
    - pattern: "-Dcatalina\\.home="
  packages:
    # List of regex patterns matched against installed RPM/DEB package names.
    - pattern: "^tomcat"
    - pattern: "^java.*openjdk"
  services:
    # List of regex patterns matched against systemd/sysvinit service names.
    - pattern: "tomcat"
  ports:
    # List of TCP port numbers (integers) matched against listening ports.
    # Exact match only -- no ranges.
    - 8080
    - 8443
  filesystem:
    # List of glob patterns matched against discovered directory paths.
    # Supports * wildcard. Matched against /opt, /usr/local, /var/lib,
    # /home, /srv, /app, /u01 (configurable via scan_dirs).
    - "/opt/tomcat*"
    - "*/apache-tomcat-*"
    - "*/bin/catalina.sh"
# ==============================================================
# VERSION DETECTION
# ==============================================================
version_command: "{app_home}/bin/version.sh 2>/dev/null | grep 'Server version'"
# Shell command to determine the application version.
# {app_home} is replaced with the detected application home directory at runtime.
# Should output a single line containing the version string.
# Use 2>/dev/null to suppress errors, and always end with a fallback.
# ==============================================================
# CONFIGURATION FILES
# ==============================================================
config_files:
  # List of configuration files to read when this application is detected.
  - name: "server_xml"
    # Identifier used as dictionary key in the report. Use snake_case.
    path: "{app_home}/conf/server.xml"
    # File path. {app_home} is replaced at runtime.
  - name: "tomcat_users_xml"
    path: "{app_home}/conf/tomcat-users.xml"
    sensitive: true
    # When true, passwords/secrets in this file are redacted with [REDACTED]
    # before being stored in the report.
  - name: "conf_dir"
    path: "{app_home}/conf.d/"
    directory: true
    # When true, lists directory contents instead of reading file content.
# ==============================================================
# HOME DIRECTORY DETECTION
# ==============================================================
home_detection:
  process_arg: "-Dcatalina.home="
  # JVM system property or command-line argument that contains the app home path.
  # Extracted from the process command line via grep.
  process_arg_alt: "-Dcatalina.base="
  # Fallback argument to try if process_arg yields nothing.
  default_home: "/opt/tomcat"
  # Static fallback path if neither process argument is found.
  default_home_alt: "/usr/share/tomcat"
  # Second static fallback path.
  env_vars:
    # Environment variables that may contain the home path.
    # Checked in /proc/PID/environ during deep inspection.
    - CATALINA_HOME
    - CATALINA_BASE
# ==============================================================
# DEEP INSPECTION FLAGS (optional)
# ==============================================================
war_introspection: true
# When true, the deep_inspect role will look for WAR/EAR files in the
# deployment directory and introspect each one (web.xml, pom.properties,
# MANIFEST.MF, WEB-INF/lib, Spring Boot detection, JPA detection).
# Capped at 20 deployments.
deployment_types:
  # List of deployment artifact types this app server supports.
  # Used to guide WAR/EAR introspection.
  - war
  - ear
  - jar
secrets_in_config:
  # List of "filename:attribute" pairs that indicate where secrets live
  # in this application's configuration files. The deep_inspect role will
  # check if these exist and flag them in the report.
  - "server.xml:keystorePass"
  - "context.xml:password"
build_files:
  # List of build file names to look for in the app home directory.
  # When found, the deep_inspect role parses them for dependencies,
  # frameworks, database drivers, and messaging libraries.
  - pom.xml
  - build.gradle
  - build.gradle.kts
# ==============================================================
# VERSION DETECTION METADATA (optional, used by deep_inspect)
# ==============================================================
version_detection:
  file: "registry.xml"
  # File relative to app_home that contains version information.
  alt_file: "lib/weblogic.jar"
  # Alternate location for version info.
  method: "manifest"
  # How to extract version: "manifest" (JAR MANIFEST.MF), "xml_element"
  # (XML tag), "properties" (Java properties file).
Example: adding an Apache Kafka detector
Create roles/detect_apps/files/detectors/kafka.yml:
name: Apache Kafka
id: kafka
category: middleware
detect:
  processes:
    - pattern: "kafka\\.Kafka"
    - pattern: "kafka-server-start"
  packages:
    - pattern: "kafka"
  services:
    - pattern: "kafka"
  ports:
    - 9092
    - 9093
  filesystem:
    - "/opt/kafka*"
    - "*/kafka/config"
version_command: "{app_home}/bin/kafka-server-start.sh --version 2>/dev/null | head -1"
config_files:
  - name: server_properties
    path: "{app_home}/config/server.properties"
  - name: log4j_properties
    path: "{app_home}/config/log4j.properties"
home_detection:
  process_arg: "-Dkafka.logs.dir="
  default_home: "/opt/kafka"
That is all. The next time the playbook runs, Kafka will be included in detection.
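Before committing a new detector, it can help to sanity-check it against the schema rules described above (required fields, snake_case id, at least one detection entry, integer ports, compilable regexes). The Python sketch below is a hypothetical helper, not part of the collection. It works on a detector that has already been loaded into a dict, for example with a YAML parser.

```python
import re

REQUIRED_FIELDS = ("name", "id", "category")
DETECT_SECTIONS = ("processes", "packages", "services", "ports", "filesystem")


def validate_detector(detector):
    """Return a list of human-readable problems; an empty list means it looks valid."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not detector.get(field):
            problems.append(f"missing required field: {field}")
    det_id = detector.get("id", "")
    if det_id and not re.fullmatch(r"[a-z][a-z0-9_]*", det_id):
        problems.append("id must be lowercase snake_case")
    detect = detector.get("detect") or {}
    if not any(detect.get(section) for section in DETECT_SECTIONS):
        problems.append("detect has no entries in any section")
    for port in detect.get("ports") or []:
        if not isinstance(port, int):
            problems.append(f"port is not an integer: {port!r}")
    for section in ("processes", "packages", "services"):
        for entry in detect.get(section) or []:
            try:
                re.compile(entry["pattern"])
            except (re.error, KeyError, TypeError):
                problems.append(f"invalid pattern entry in {section}: {entry!r}")
    return problems


kafka = {
    "name": "Apache Kafka",
    "id": "kafka",
    "category": "middleware",
    "detect": {"processes": [{"pattern": r"kafka\.Kafka"}], "ports": [9092, 9093]},
}
print(validate_detector(kafka))
```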
How Java Apps Are Configured in the Wild
Java application servers share common patterns that this collection captures:
JVM Configuration: Heap sizes (-Xms/-Xmx), garbage collector selection (-XX:+UseG1GC), system properties (-Djava.io.tmpdir), and Java agents (-javaagent:) are passed on the command line. These are critical for capacity planning during migration.
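As an illustration of how these flags can be pulled from a process command line, the Python sketch below splits the command line and sorts arguments into heap settings, -XX flags, system properties, and agents. The function name and output keys are assumptions modeled on the report fields, not the role's actual parser.

```python
import shlex


def parse_jvm_args(cmdline):
    """Classify JVM arguments found on a java process command line."""
    jvm = {
        "heap_min": None,
        "heap_max": None,
        "gc_algorithm": None,
        "system_properties": {},
        "jvm_agents": [],
        "xx_flags": [],
    }
    for arg in shlex.split(cmdline):
        if arg.startswith("-Xms"):
            jvm["heap_min"] = arg[4:]
        elif arg.startswith("-Xmx"):
            jvm["heap_max"] = arg[4:]
        elif arg.startswith("-XX:"):
            jvm["xx_flags"].append(arg)
            # Collector selection flags look like -XX:+UseG1GC or -XX:+UseParallelGC.
            if arg.startswith("-XX:+Use") and arg.endswith("GC"):
                jvm["gc_algorithm"] = arg[len("-XX:+Use"):]
        elif arg.startswith("-javaagent:"):
            jvm["jvm_agents"].append(arg[len("-javaagent:"):])
        elif arg.startswith("-D"):
            key, _, value = arg[2:].partition("=")
            jvm["system_properties"][key] = value
    return jvm


cmd = (
    "/usr/bin/java -Xms512m -Xmx2048m -XX:+UseG1GC -XX:MaxGCPauseMillis=200 "
    "-Dcatalina.home=/opt/tomcat org.apache.catalina.startup.Bootstrap start"
)
print(parse_jvm_args(cmd))
```

The sketch scans every token, including any that follow the main class. A stricter parser would stop at the main class or -jar argument.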
Configuration Files: Each app server has its own configuration layout (server.xml for Tomcat, standalone.xml for JBoss, config.xml for WebLogic) but they all define similar things: datasources, thread pools, security realms, and clustering. Passwords and connection strings live in these files.
Deployments: Applications are packaged as WAR (web) or EAR (enterprise) archives and dropped into a deployment directory. Each WAR contains WEB-INF/web.xml (servlet mappings), WEB-INF/lib/ (dependencies), and optionally Maven metadata and Spring Boot configuration.
Keystores: TLS certificates are stored in Java KeyStore (.jks) or PKCS12 (.p12) files. The keystore path and password are referenced in server configuration. Many production systems still use the default password changeit.
External Dependencies: Applications connect to databases (via JDBC), message queues (JMS/Kafka/RabbitMQ), and caches. These connections are defined in server config files or application properties and represent dependencies that must be available after migration.
Environment Injection: Configuration is often injected via setenv.sh (Tomcat), standalone.conf (JBoss), setDomainEnv.sh (WebLogic), or systemd unit Environment directives. These scripts set JAVA_HOME, JAVA_OPTS, CATALINA_OPTS, and custom properties.
MTA Integration Boundary
Each application in the report includes an mta_handoff section that clearly delineates responsibilities:
This collection discovers (infrastructure level):
- Runtime JVM configuration (heap, GC, system properties, agents)
- Listening ports and external network connections
- Keystores with certificate details
- JDBC datasource URLs and drivers
- System packages (RPM/DEB)
- Environment variables and setenv scripts
- Cron jobs referencing the application
- Log configuration and file paths
- WAR/EAR contents (web.xml, dependencies, Maven coordinates)
- Secrets in configuration files
MTA handles (code level):
- Source code analysis and API compatibility
- Deprecated API detection (javax -> jakarta, etc.)
- Framework migration rules
- Dockerfile and container manifest generation
- Dependency vulnerability scanning
- Code-level refactoring recommendations
Artifacts to pass to MTA:
- WAR/EAR files listed in mta_handoff.artifacts_for_mta
- Source repositories (if pom.xml/build.gradle found at build_info.build_files)
- Configuration files listed in mta_handoff.config_for_mta
Example Output
{
"host": "tomcat01.example.com",
"scan_timestamp": "2026-04-10T14:30:00Z",
"scan_version": "2.0.0",
"scope": "java",
"os": {
"distribution": "Red Hat Enterprise Linux",
"distribution_version": "8.9",
"os_family": "RedHat",
"kernel": "4.18.0-513.el8.x86_64",
"architecture": "x86_64"
},
"hardware": {
"vcpus": 4,
"memory_mb": 16384,
"swap_mb": 2048
},
"java_environments": [
{
"java_home": "/usr/lib/jvm/java-21-openjdk-21.0.10.0.7-1.el8.x86_64",
"version": "21.0.10",
"vendor": "Red Hat"
},
{
"java_home": "/usr/lib/jvm/java-11-openjdk-11.0.25.0.9-3.el8.x86_64",
"version": "11.0.25",
"vendor": "Red Hat"
}
],
"discovered_applications": [
{
"name": "Apache Tomcat",
"id": "tomcat",
"category": "application_server",
"version": "Server version: Apache Tomcat/9.0.93",
"confidence": "high",
"detection_methods": ["process", "service", "port", "filesystem"],
"home_path": "/opt/tomcat",
"jvm": {
"java_version": "21.0.10",
"java_home": "/usr/lib/jvm/java-21-openjdk",
"java_vendor": "Red Hat",
"heap_min": "512m",
"heap_max": "2048m",
"gc_algorithm": "G1GC",
"system_properties": {
"catalina.home": "/opt/tomcat",
"catalina.base": "/opt/tomcat",
"java.io.tmpdir": "/opt/tomcat/temp",
"java.util.logging.config.file": "/opt/tomcat/conf/logging.properties"
},
"jvm_agents": [],
"xx_flags": ["-XX:+UseG1GC", "-XX:MaxGCPauseMillis=200"]
},
"ports": [8080, 8443],
"config_files": {
"server_xml": {"path": "/opt/tomcat/conf/server.xml", "exists": true, "size": 7542},
"context_xml": {"path": "/opt/tomcat/conf/context.xml", "exists": true, "size": 1234}
},
"deployments": [
{
"name": "myapp.war",
"type": "war",
"size_bytes": 45678912,
"maven_coordinates": {
"groupId": "com.example",
"artifactId": "myapp",
"version": "2.1.0"
},
"dependencies": [
"spring-core-5.3.30.jar",
"spring-web-5.3.30.jar",
"hibernate-core-5.6.15.Final.jar",
"postgresql-42.6.0.jar",
"logback-classic-1.2.12.jar",
"slf4j-api-1.7.36.jar"
],
"web_xml": {
"servlets": ["dispatcherServlet"],
"filters": ["encodingFilter", "springSecurityFilterChain"],
"listeners": ["org.springframework.web.context.ContextLoaderListener"]
},
"spring_boot": true,
"jpa_configured": true
}
],
"jdbc_connections": [
{
"source_file": "/opt/tomcat/conf/context.xml",
"url": "jdbc:postgresql://db.internal:5432/appdb",
"host": "db.internal",
"port": "5432",
"database": "appdb",
"driver": "org.postgresql.Driver"
}
],
"keystores": [
{
"path": "/opt/tomcat/conf/keystore.jks",
"type": "JKS",
"password_is_default": true,
"aliases": ["tomcat"]
}
],
"external_connections": [
{"remote_host": "db.internal", "remote_port": 5432, "protocol": "tcp"},
{"remote_host": "redis.internal", "remote_port": 6379, "protocol": "tcp"}
],
"log_config": {
"type": "logback",
"config_files": ["/opt/tomcat/webapps/myapp/WEB-INF/classes/logback-spring.xml"],
"log_paths": ["/var/log/myapp/application.log"]
},
"system_packages": ["java-21-openjdk-21.0.10.0.7-1.el8.x86_64", "tomcat-native-1.2.39-1.el8.x86_64"],
"environment_vars": {
"CATALINA_HOME": "/opt/tomcat",
"CATALINA_BASE": "/opt/tomcat",
"JAVA_HOME": "/usr/lib/jvm/java-21-openjdk",
"CATALINA_OPTS": "-Xms512m -Xmx2048m -XX:+UseG1GC"
},
"cron_jobs": [],
"secrets_found": ["server.xml:keystorePass:/opt/tomcat/conf/server.xml"],
"migration_flags": {
"keystore_outside_app_home": false,
"external_path_references": ["/opt/tomcat/temp", "/opt/tomcat/conf/logging.properties"],
"session_persistence_configured": false,
"clustering_configured": false,
"jndi_datasources": 1,
"custom_class_loader": false
},
"mta_handoff": {
"analysis_target": "containerization",
"artifacts_for_mta": ["myapp.war"],
"config_for_mta": ["/opt/tomcat/conf/server.xml", "/opt/tomcat/conf/context.xml"],
"this_collection_discovers": ["Runtime JVM configuration...", "..."],
"mta_handles": ["Source code analysis...", "..."],
"note": "MTA handles: dependency analysis, API compatibility, code-level migration issues. This collection handles: infrastructure-level discovery, runtime config, secrets, system packages."
}
}
],
"summary": {
"total_apps": 1,
"high_confidence": 1,
"medium_confidence": 0,
"low_confidence": 0,
"categories": {"application_server": 1},
"total_java_environments": 2
}
}
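A downstream tool can read the per-host report and collect what needs to go to MTA. The Python sketch below uses only field names that appear in the example output above (discovered_applications, id, mta_handoff, artifacts_for_mta, config_for_mta). The helper name and the trimmed sample document are illustrative.

```python
import json


def collect_mta_handoff(report):
    """Map each discovered application id to the artifacts and config files for MTA."""
    handoff = {}
    for app in report.get("discovered_applications", []):
        mta = app.get("mta_handoff", {})
        handoff[app["id"]] = {
            "artifacts": mta.get("artifacts_for_mta", []),
            "config": mta.get("config_for_mta", []),
        }
    return handoff


# Trimmed-down version of the example report shown above.
sample = json.loads("""
{
  "host": "tomcat01.example.com",
  "discovered_applications": [
    {
      "id": "tomcat",
      "mta_handoff": {
        "artifacts_for_mta": ["myapp.war"],
        "config_for_mta": ["/opt/tomcat/conf/server.xml", "/opt/tomcat/conf/context.xml"]
      }
    }
  ]
}
""")
print(collect_mta_handoff(sample))
```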
Roadmap
Future runtime support (not yet implemented):
- Node.js / TypeScript applications
- Python web applications (Django, Flask, FastAPI)
- .NET / .NET Core applications
- Go applications
- Ruby on Rails
Design Decisions
- Java-only scope -- focused depth over breadth. Deep JVM inspection, WAR introspection, and build analysis provide migration-critical data that shallow multi-runtime scanning cannot.
- ignore_errors: true on all data-gathering tasks -- VMs in the wild have missing commands, restricted permissions, and non-standard layouts. The collection should always produce a report, even if partial.
- ANSI stripping on all shell output -- terminals on Fedora/RHEL can inject escape codes into piped output via grep aliases.
- Sensitive file redaction -- passwords and keys in config files are replaced with [REDACTED] before capture.
- 1MB cap on config file reads -- prevents memory issues with unexpectedly large files.
- 20 deployment cap -- WAR introspection is capped to prevent runaway execution on servers with many deployments.
- Default keystore password probing -- tries changeit and changeme only. Never attempts brute force. Flags default passwords as a security finding.
- No Tower/AAP dependency -- produces standalone JSON files that can be consumed by any downstream tool.
- MTA boundary -- explicitly separates infrastructure discovery from code analysis to avoid duplicating MTA capabilities.
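The ANSI stripping decision can be illustrated with a short Python sketch. It removes CSI escape sequences (the kind that colorized grep emits) from captured output. The regular expression and function name are assumptions for the example. The collection itself may strip escapes differently, for instance inside a Jinja2 filter.

```python
import re

# CSI sequence: ESC [ + parameter bytes (0x30-0x3F) + intermediate bytes (0x20-0x2F)
# + one final byte (0x40-0x7E). This covers color codes such as ESC[01;31m and ESC[K.
_ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text):
    """Remove ANSI CSI escape sequences from shell output."""
    return _ANSI_CSI.sub("", text)


colored = "\x1b[01;31m\x1b[Ktomcat\x1b[m\x1b[K 8080"
print(repr(strip_ansi(colored)))
```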
License
Apache-2.0