Compare commits

...

11 Commits

Author SHA1 Message Date
CanbiZ (MickLesk)
8f30db9cc3 Implement telemetry settings and repo source detection
Add telemetry configuration and repository source detection function.
2026-02-17 08:37:58 +01:00
CanbiZ (MickLesk)
2d7e707a00 fix(build): preserve exit code in ERR trap to prevent false exit_code=0
The ERR trap called ensure_log_on_host before post_update_to_api,
which reset \True to 0 (success). This caused ~15-20 records/day to be
reported as 'failed' with exit_code=0 instead of the actual error code.

Root cause chain:
1. Command fails with exit code N → ERR trap fires (\True = N)
2. ensure_log_on_host succeeds → \True becomes 0
3. post_update_to_api 'failed' '\True' → sends 'failed/0' (wrong!)
4. POST_UPDATE_DONE=true → EXIT trap skips the correct code

Fix: capture \True into _ERR_CODE before ensure_log_on_host runs.
2026-02-16 18:35:33 +01:00
CanbiZ (MickLesk)
ccccd8abca fix: sync error_handler fallback, Alpine APK repair, retry limit
error_handler.func:
- Sync fallback explain_exit_code() with api.func: add 25+ codes that
  were missing (curl 16/18/24/26/27/32-34/36/39/44-48/51/52/55/57/59/
  61/63/79/92/95, signals 125/129/131/132/144/146, npm 239, code 3/8)
- Ensures consistent error descriptions even when api.func isn't loaded

build.func:
- Alpine APK repair: detect var_os=alpine and run 'apk fix && apk
  cache clean && apk update' instead of apt-get/dpkg commands
- Show 'Repair APK state' instead of 'APT/DPKG' in menu for Alpine
- Retry safety counter: OOM x2 retry limited to max 2 attempts
  (prevents infinite RAM doubling via RECOVERY_ATTEMPT env var)
- Show attempt count in rebuild summary
2026-02-16 18:24:33 +01:00
CanbiZ (MickLesk)
f463bd5029 feat(api+build): map 25 more exit codes, add SIGHUP trap, network/perm hints
api.func:
- Map 25+ new exit codes that were showing as 'Unknown' in telemetry:
  curl: 3, 16, 18, 24, 26, 32-34, 39, 44, 46, 48, 51, 52, 57, 59, 61,
  63, 79, 92, 95; signals: 125, 132, 144, 146
- Update code 8 description (FTP + apk untrusted key)
- Update header comment with full supported ranges

build.func:
- Add SIGHUP trap: reports 'failed/129' to API when terminal is closed,
  should significantly reduce the 2841 stuck 'installing' records
- Add exit 52 (empty reply) and 57 (poll error) to network issue
  detection for DNS override recovery option
- Add exit 125/126 hint: suggests privileged mode for permission errors
2026-02-16 18:15:28 +01:00
CanbiZ (MickLesk)
eef38514b1 feat(build): APT in-place repair, exit 1 subclassification, new exit codes
- Add APT/DPKG in-place recovery: detects exit 100/101/102/255 and exit 1
  with APT log patterns, offers to repair dpkg state and re-run install
  script without destroying the container
- Add exit 1 subclassification: analyzes combined log to identify root
  cause (APT, OOM, network, command-not-found) and routes to appropriate
  recovery option
- Add exit 10 hint: shows privileged mode / nesting suggestion
- Add exit 127 hint: extracts missing command name from logs
- Refactor recovery menu: use named option variables (APT_OPTION,
  OOM_OPTION, DNS_OPTION) instead of hardcoded option numbers, supports
  up to 6 dynamic options cleanly
- Map missing exit codes in api.func: curl 27/36/45/47/55, signals
  129 (SIGHUP) / 131 (SIGQUIT), npm 239
2026-02-16 18:06:44 +01:00
CanbiZ (MickLesk)
03f5cd9de5 fix(build): restore smart recovery and add OOM/DNS retry paths 2026-02-16 17:46:43 +01:00
CanbiZ (MickLesk)
9c03b34e7d Merge remote-tracking branch 'origin/main' into feature/smart-error-recovery
# Conflicts:
#	misc/api.func
#	misc/build.func
#	misc/error_handler.func
2026-02-16 17:39:36 +01:00
MickLesk
28c19a79d3 feat(exit-codes): add systemd and build error codes (150-154)
- 150: Systemd service failed to start
- 151: Systemd service unit not found
- 152: Permission denied (EACCES)
- 153: Build/compile failed (make/gcc/cmake)
- 154: Node.js native addon build failed (node-gyp)

Based on issue analysis: 57 service failures, 25 build failures, 22 node-gyp issues
2026-01-26 20:48:17 +01:00
MickLesk
8e4d5b1d28 fix(exit-codes): sync error_handler.func and api.func with conflict-free code ranges
- Add curl error codes (6, 7, 22, 28, 35)
- Add APT lock code (102), timeout (124), signals (134, 141)
- Move Python codes: 210-212 → 160-162 (avoid Proxmox conflict)
- Move PostgreSQL codes: 231-234 → 170-173
- Move MySQL/MariaDB codes: 241-244 → 180-183
- Move MongoDB codes: 251-254 → 190-193
- Keep Node.js at 243-249, Proxmox at 200-231
- Both files now synchronized with identical mappings
2026-01-26 20:28:37 +01:00
MickLesk
e731b9fb65 fix(api.func): fix duplicate exit codes and add missing error codes
Exit code fixes:
- Remove duplicate definitions for codes 243, 254 (Node.js vs DB)
- Reassign MySQL/MariaDB to 240-242, 244 (was 241-244)
- Reassign MongoDB to 250-253 (was 251-254)

New exit codes added (based on GitHub issues analysis):
- 6: curl couldn't resolve host (DNS failure)
- 7: curl failed to connect (network unreachable)
- 22: curl HTTP error (404, 429 rate limit, 500)
- 28: curl timeout (very common in download failures)
- 35: curl SSL error
- 102: APT lock held by another process
- 124: Command timeout
- 141: SIGPIPE (broken pipe)

Also update OOM detection to include exit code 134 (SIGABRT)
which is commonly seen in Node.js heap overflow issues.

Fixes based on analysis of ~500 GitHub issues.
2026-01-26 20:23:04 +01:00
MickLesk
93cb6f99fe feat(build.func): smart error recovery menu for failed installations
Replace simple Y/n removal prompt with interactive recovery menu:

- Option 1: Remove container and exit (default, auto after 60s timeout)
- Option 2: Keep container for debugging
- Option 3: Retry installation with verbose mode enabled
- Option 4: Retry with 1.5x RAM and +1 CPU core (OOM errors only)

Improvements:
- Detect OOM errors (exit codes 137, 243) and offer resource increase
- Show human-readable error explanation using explain_exit_code()
- Recursive rebuild preserves ALL settings from advanced/app.vars/default.vars
- Settings preserved: Network (IP, Gateway, VLAN, MTU, Bridge), Features
  (Nesting, FUSE, TUN, GPU), Storage, SSH keys, Tags, Hostname, etc.
- Show rebuild summary before retry (old→new CTID, resources, network)
- New container ID generated automatically for rebuilds

This helps users recover from transient failures without re-running
the entire script manually.
2026-01-26 20:19:22 +01:00
3 changed files with 387 additions and 23 deletions

View File

@@ -117,16 +117,17 @@ detect_repo_source
# - Canonical source of truth for ALL exit code mappings
# - Used by both api.func (telemetry) and error_handler.func (error display)
# - Supports:
# * Generic/Shell errors (1, 2, 124, 126-130, 134, 137, 139, 141, 143)
# * curl/wget errors (6, 7, 22, 28, 35)
# * Generic/Shell errors (1-3, 10, 124-132, 134, 137, 139, 141, 143-146)
# * curl/wget errors (4-8, 16, 18, 22-28, 30, 32-36, 39, 44-48, 51-52, 55-57, 59, 61, 63, 75, 78-79, 92, 95)
# * Package manager errors (APT, DPKG: 100-102, 255)
# * BSD sysexits (64-78)
# * Systemd/Service errors (150-154)
# * Python/pip/uv errors (160-162)
# * PostgreSQL errors (170-173)
# * MySQL/MariaDB errors (180-183)
# * MongoDB errors (190-193)
# * Proxmox custom codes (200-231)
# * Node.js/npm errors (243, 245-249)
# * Node.js/npm errors (239, 243, 245-249)
# - Returns description string for given exit code
# ------------------------------------------------------------------------------
explain_exit_code() {
@@ -135,6 +136,7 @@ explain_exit_code() {
# --- Generic / Shell ---
1) echo "General error / Operation not permitted" ;;
2) echo "Misuse of shell builtins (e.g. syntax error)" ;;
3) echo "General syntax or argument error" ;;
10) echo "Docker / privileged mode required (unsupported environment)" ;;
# --- curl / wget errors (commonly seen in downloads) ---
@@ -142,16 +144,41 @@ explain_exit_code() {
5) echo "curl: Could not resolve proxy" ;;
6) echo "curl: DNS resolution failed (could not resolve host)" ;;
7) echo "curl: Failed to connect (network unreachable / host down)" ;;
8) echo "curl: FTP server reply error" ;;
8) echo "curl: Server reply error (FTP/SFTP or apk untrusted key)" ;;
16) echo "curl: HTTP/2 framing layer error" ;;
18) echo "curl: Partial file (transfer not completed)" ;;
22) echo "curl: HTTP error returned (404, 429, 500+)" ;;
23) echo "curl: Write error (disk full or permissions)" ;;
24) echo "curl: Write to local file failed" ;;
25) echo "curl: Upload failed" ;;
26) echo "curl: Read error on local file (I/O)" ;;
27) echo "curl: Out of memory (memory allocation failed)" ;;
28) echo "curl: Operation timeout (network slow or server not responding)" ;;
30) echo "curl: FTP port command failed" ;;
32) echo "curl: FTP SIZE command failed" ;;
33) echo "curl: HTTP range error" ;;
34) echo "curl: HTTP post error" ;;
35) echo "curl: SSL/TLS handshake failed (certificate error)" ;;
36) echo "curl: FTP bad download resume" ;;
39) echo "curl: LDAP search failed" ;;
44) echo "curl: Internal error (bad function call order)" ;;
45) echo "curl: Interface error (failed to bind to specified interface)" ;;
46) echo "curl: Bad password entered" ;;
47) echo "curl: Too many redirects" ;;
48) echo "curl: Unknown command line option specified" ;;
51) echo "curl: SSL peer certificate or SSH host key verification failed" ;;
52) echo "curl: Empty reply from server (got nothing)" ;;
55) echo "curl: Failed sending network data" ;;
56) echo "curl: Receive error (connection reset by peer)" ;;
57) echo "curl: Unrecoverable poll/select error (system I/O failure)" ;;
59) echo "curl: Couldn't use specified SSL cipher" ;;
61) echo "curl: Bad/unrecognized transfer encoding" ;;
63) echo "curl: Maximum file size exceeded" ;;
75) echo "Temporary failure (retry later)" ;;
78) echo "curl: Remote file not found (404 on FTP/file)" ;;
79) echo "curl: SSH session error (key exchange/auth failed)" ;;
92) echo "curl: HTTP/2 stream error (protocol violation)" ;;
95) echo "curl: HTTP/3 layer error" ;;
# --- Package manager / APT / DPKG ---
100) echo "APT: Package manager error (broken packages / dependency problems)" ;;
@@ -175,15 +202,21 @@ explain_exit_code() {
# --- Common shell/system errors ---
124) echo "Command timed out (timeout command)" ;;
125) echo "Command failed to start (Docker daemon or execution error)" ;;
126) echo "Command invoked cannot execute (permission problem?)" ;;
127) echo "Command not found" ;;
128) echo "Invalid argument to exit" ;;
129) echo "Killed by SIGHUP (terminal closed / hangup)" ;;
130) echo "Aborted by user (SIGINT)" ;;
131) echo "Killed by SIGQUIT (core dumped)" ;;
132) echo "Killed by SIGILL (illegal CPU instruction)" ;;
134) echo "Process aborted (SIGABRT - possibly Node.js heap overflow)" ;;
137) echo "Killed (SIGKILL / Out of memory?)" ;;
139) echo "Segmentation fault (core dumped)" ;;
141) echo "Broken pipe (SIGPIPE - output closed prematurely)" ;;
143) echo "Terminated (SIGTERM)" ;;
144) echo "Killed by signal 16 (SIGUSR1 / SIGSTKFLT)" ;;
146) echo "Killed by signal 18 (SIGTSTP)" ;;
# --- Systemd / Service errors (150-154) ---
150) echo "Systemd: Service failed to start" ;;
@@ -191,7 +224,6 @@ explain_exit_code() {
152) echo "Permission denied (EACCES)" ;;
153) echo "Build/compile failed (make/gcc/cmake)" ;;
154) echo "Node.js: Native addon build failed (node-gyp)" ;;
# --- Python / pip / uv (160-162) ---
160) echo "Python: Virtualenv / uv environment missing or broken" ;;
161) echo "Python: Dependency resolution failed" ;;
@@ -242,7 +274,8 @@ explain_exit_code() {
225) echo "Proxmox: No template available for OS/Version" ;;
231) echo "Proxmox: LXC stack upgrade failed" ;;
# --- Node.js / npm / pnpm / yarn (243-249) ---
# --- Node.js / npm / pnpm / yarn (239-249) ---
239) echo "npm/Node.js: Unexpected runtime error or dependency failure" ;;
243) echo "Node.js: Out of memory (JavaScript heap out of memory)" ;;
245) echo "Node.js: Invalid command-line option" ;;
246) echo "Node.js: Internal JavaScript Parse Error" ;;

View File

@@ -297,7 +297,7 @@ validate_container_id() {
# Falls back gracefully if pvesh unavailable or returns empty
if command -v pvesh &>/dev/null; then
local cluster_ids
cluster_ids=$(pvesh get /cluster/resources --type vm --output-format json 2>/dev/null |
cluster_ids=$(pvesh get /cluster/resources --type vm --output-format json 2>/dev/null |
grep -oP '"vmid":\s*\K[0-9]+' 2>/dev/null || true)
if [[ -n "$cluster_ids" ]] && echo "$cluster_ids" | grep -qw "$ctid"; then
return 1
@@ -4038,6 +4038,13 @@ EOF'
msg_ok "Customized LXC Container"
# Optional DNS override for retry scenarios (inside LXC, never on host)
if [[ "${DNS_RETRY_OVERRIDE:-false}" == "true" ]]; then
msg_info "Applying DNS retry override in LXC (8.8.8.8, 1.1.1.1)"
pct exec "$CTID" -- bash -c "printf 'nameserver 8.8.8.8\nnameserver 1.1.1.1\n' >/etc/resolv.conf" >/dev/null 2>&1 || true
msg_ok "DNS override applied in LXC"
fi
# Install SSH keys
install_ssh_keys_into_ct
@@ -4150,32 +4157,322 @@ EOF'
# Prompt user for cleanup with 60s timeout
echo ""
echo -en "${TAB}${TAB}${YW}Remove broken container ${CTID}? (Y/n) [auto-remove in 60s]: ${CL}"
# Detect error type for smart recovery options
local is_oom=false
local is_network_issue=false
local is_apt_issue=false
local is_cmd_not_found=false
local error_explanation=""
if declare -f explain_exit_code >/dev/null 2>&1; then
error_explanation="$(explain_exit_code "$install_exit_code")"
fi
# OOM detection: exit codes 134 (SIGABRT/heap), 137 (SIGKILL/OOM), 243 (Node.js heap)
if [[ $install_exit_code -eq 134 || $install_exit_code -eq 137 || $install_exit_code -eq 243 ]]; then
is_oom=true
fi
# APT/DPKG detection: exit codes 100-102 (APT), 255 (DPKG with log evidence)
case "$install_exit_code" in
100 | 101 | 102) is_apt_issue=true ;;
255)
if [[ -f "$combined_log" ]] && grep -qiE 'dpkg|apt-get|apt\.conf|broken packages|unmet dependencies|E: Sub-process|E: Failed' "$combined_log"; then
is_apt_issue=true
fi
;;
esac
# Command not found detection
if [[ $install_exit_code -eq 127 ]]; then
is_cmd_not_found=true
fi
# Network-related detection (curl/apt/git fetch failures and transient network issues)
case "$install_exit_code" in
6 | 7 | 22 | 28 | 35 | 52 | 56 | 57 | 75 | 78) is_network_issue=true ;;
100)
# APT can fail due to network (Failed to fetch)
if [[ -f "$combined_log" ]] && grep -qiE 'Failed to fetch|Could not resolve|Connection failed|Network is unreachable|Temporary failure resolving' "$combined_log"; then
is_network_issue=true
fi
;;
128)
if [[ -f "$combined_log" ]] && grep -qiE 'RPC failed|early EOF|fetch-pack|HTTP/2 stream|Could not resolve host|Temporary failure resolving|Failed to fetch|Connection reset|Network is unreachable' "$combined_log"; then
is_network_issue=true
fi
;;
esac
# Exit 1 subclassification: analyze logs to identify actual root cause
# Many exit 1 errors are actually APT, OOM, network, or command-not-found issues
if [[ $install_exit_code -eq 1 && -f "$combined_log" ]]; then
if grep -qiE 'E: Unable to|E: Package|E: Failed to fetch|dpkg.*error|broken packages|unmet dependencies|dpkg --configure -a' "$combined_log"; then
is_apt_issue=true
fi
if grep -qiE 'Cannot allocate memory|Out of memory|oom-killer|Killed process|JavaScript heap' "$combined_log"; then
is_oom=true
fi
if grep -qiE 'Could not resolve|DNS|Connection refused|Network is unreachable|No route to host|Temporary failure resolving|Failed to fetch' "$combined_log"; then
is_network_issue=true
fi
if grep -qiE ': command not found|No such file or directory.*/s?bin/' "$combined_log"; then
is_cmd_not_found=true
fi
fi
# Show error explanation if available
if [[ -n "$error_explanation" ]]; then
echo -e "${TAB}${RD}Error: ${error_explanation}${CL}"
echo ""
fi
# Show specific hints for known error types
if [[ $install_exit_code -eq 10 ]]; then
echo -e "${TAB}${INFO} This error usually means the container needs ${GN}privileged${CL} mode or Docker/nesting support."
echo -e "${TAB}${INFO} Recreate with: Advanced Install → Container Type: ${GN}Privileged${CL}"
echo ""
fi
if [[ $install_exit_code -eq 125 || $install_exit_code -eq 126 ]]; then
echo -e "${TAB}${INFO} The command exists but cannot be executed. This may be a ${GN}permission${CL} issue."
echo -e "${TAB}${INFO} If using Docker, ensure the container is ${GN}privileged${CL} or has correct permissions."
echo ""
fi
if [[ "$is_cmd_not_found" == true ]]; then
local missing_cmd=""
if [[ -f "$combined_log" ]]; then
missing_cmd=$(grep -oiE '[a-zA-Z0-9_.-]+: command not found' "$combined_log" | tail -1 | sed 's/: command not found//')
fi
if [[ -n "$missing_cmd" ]]; then
echo -e "${TAB}${INFO} Missing command: ${GN}${missing_cmd}${CL}"
fi
echo ""
fi
# Build recovery menu based on error type
echo -e "${YW}What would you like to do?${CL}"
echo ""
echo -e " ${GN}1)${CL} Remove container and exit"
echo -e " ${GN}2)${CL} Keep container for debugging"
echo -e " ${GN}3)${CL} Retry with verbose mode (full rebuild)"
local next_option=4
local APT_OPTION="" OOM_OPTION="" DNS_OPTION=""
if [[ "$is_apt_issue" == true ]]; then
if [[ "$var_os" == "alpine" ]]; then
echo -e " ${GN}${next_option})${CL} Repair APK state and re-run install (in-place)"
else
echo -e " ${GN}${next_option})${CL} Repair APT/DPKG state and re-run install (in-place)"
fi
APT_OPTION=$next_option
next_option=$((next_option + 1))
fi
if [[ "$is_oom" == true ]]; then
local recovery_attempt="${RECOVERY_ATTEMPT:-0}"
if [[ $recovery_attempt -lt 2 ]]; then
local new_ram=$((RAM_SIZE * 2))
local new_cpu=$((CORE_COUNT * 2))
echo -e " ${GN}${next_option})${CL} Retry with more resources (RAM: ${RAM_SIZE}${new_ram} MiB, CPU: ${CORE_COUNT}${new_cpu} cores)"
OOM_OPTION=$next_option
next_option=$((next_option + 1))
else
echo -e " ${DGN}-)${CL} ${DGN}OOM retry exhausted (already retried ${recovery_attempt}x)${CL}"
fi
fi
if [[ "$is_network_issue" == true ]]; then
echo -e " ${GN}${next_option})${CL} Retry with DNS override in LXC (8.8.8.8 / 1.1.1.1)"
DNS_OPTION=$next_option
next_option=$((next_option + 1))
fi
local max_option=$((next_option - 1))
echo ""
echo -en "${YW}Select option [1-${max_option}] (default: 1, auto-remove in 60s): ${CL}"
if read -t 60 -r response; then
if [[ -z "$response" || "$response" =~ ^[Yy]$ ]]; then
case "${response:-1}" in
1)
# Remove container
echo ""
msg_info "Removing container ${CTID}"
echo -e "\n${TAB}${HOLD}${YW}Removing container ${CTID}${CL}"
pct stop "$CTID" &>/dev/null || true
pct destroy "$CTID" &>/dev/null || true
msg_ok "Container ${CTID} removed"
elif [[ "$response" =~ ^[Nn]$ ]]; then
echo ""
msg_warn "Container ${CTID} kept for debugging"
echo -e "${BFR}${CM}${GN}Container ${CTID} removed${CL}"
;;
2)
echo -e "\n${TAB}${YW}Container ${CTID} kept for debugging${CL}"
# Dev mode: Setup MOTD/SSH for debugging access to broken container
if [[ "${DEV_MODE_MOTD:-false}" == "true" ]]; then
echo -e "${TAB}${HOLD}${DGN}Setting up MOTD and SSH for debugging...${CL}"
if pct exec "$CTID" -- bash -c "
source <(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/misc/install.func)
declare -f motd_ssh >/dev/null 2>&1 && motd_ssh || true
" >/dev/null 2>&1; then
source <(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/misc/install.func)
declare -f motd_ssh >/dev/null 2>&1 && motd_ssh || true
" >/dev/null 2>&1; then
local ct_ip=$(pct exec "$CTID" ip a s dev eth0 2>/dev/null | awk '/inet / {print $2}' | cut -d/ -f1)
echo -e "${BFR}${CM}${GN}MOTD/SSH ready - SSH into container: ssh root@${ct_ip}${CL}"
fi
fi
fi
exit $install_exit_code
;;
3)
# Retry with verbose mode (full rebuild)
echo -e "\n${TAB}${HOLD}${YW}Removing container ${CTID} for rebuild...${CL}"
pct stop "$CTID" &>/dev/null || true
pct destroy "$CTID" &>/dev/null || true
echo -e "${BFR}${CM}${GN}Container ${CTID} removed${CL}"
echo ""
# Get new container ID
local old_ctid="$CTID"
export CTID=$(get_valid_container_id "$CTID")
export VERBOSE="yes"
export var_verbose="yes"
# Show rebuild summary
echo -e "${YW}Rebuilding with preserved settings:${CL}"
echo -e " Container ID: ${old_ctid}${CTID}"
echo -e " RAM: ${RAM_SIZE} MiB | CPU: ${CORE_COUNT} cores | Disk: ${DISK_SIZE} GB"
echo -e " Network: ${NET:-dhcp} | Bridge: ${BRG:-vmbr0}"
echo -e " Verbose: ${GN}enabled${CL}"
echo ""
msg_info "Restarting installation..."
# Re-run build_container
build_container
return $?
;;
*)
# Handle dynamic smart recovery options via named option variables
local handled=false
if [[ -n "${APT_OPTION}" && "${response}" == "${APT_OPTION}" ]]; then
# Package manager in-place repair: fix broken state and re-run install script
handled=true
if [[ "$var_os" == "alpine" ]]; then
echo -e "\n${TAB}${HOLD}${YW}Repairing APK state in container ${CTID}...${CL}"
pct exec "$CTID" -- ash -c "
apk fix 2>/dev/null || true
apk cache clean 2>/dev/null || true
apk update 2>/dev/null || true
" >/dev/null 2>&1 || true
echo -e "${BFR}${CM}${GN}APK state repaired in container ${CTID}${CL}"
else
echo -e "\n${TAB}${HOLD}${YW}Repairing APT/DPKG state in container ${CTID}...${CL}"
pct exec "$CTID" -- bash -c "
DEBIAN_FRONTEND=noninteractive dpkg --configure -a 2>/dev/null || true
apt-get -f install -y 2>/dev/null || true
apt-get clean 2>/dev/null
apt-get update 2>/dev/null || true
" >/dev/null 2>&1 || true
echo -e "${BFR}${CM}${GN}APT/DPKG state repaired in container ${CTID}${CL}"
fi
echo ""
export VERBOSE="yes"
export var_verbose="yes"
echo -e "${YW}Re-running installation in existing container ${CTID}:${CL}"
echo -e " RAM: ${RAM_SIZE} MiB | CPU: ${CORE_COUNT} cores | Disk: ${DISK_SIZE} GB"
echo -e " Verbose: ${GN}enabled${CL}"
echo ""
msg_info "Re-running installation script..."
# Re-run install script in existing container (don't destroy/recreate)
set +Eeuo pipefail
trap - ERR
lxc-attach -n "$CTID" -- bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/install/${var_install}.sh)"
local apt_retry_exit=$?
set -Eeuo pipefail
trap 'error_handler' ERR
# Check for error flag from retry
local apt_retry_code=0
if [[ -n "${SESSION_ID:-}" ]]; then
local retry_error_flag="/root/.install-${SESSION_ID}.failed"
if pct exec "$CTID" -- test -f "$retry_error_flag" 2>/dev/null; then
apt_retry_code=$(pct exec "$CTID" -- cat "$retry_error_flag" 2>/dev/null || echo "1")
pct exec "$CTID" -- rm -f "$retry_error_flag" 2>/dev/null || true
fi
fi
if [[ $apt_retry_code -eq 0 && $apt_retry_exit -ne 0 ]]; then
apt_retry_code=$apt_retry_exit
fi
if [[ $apt_retry_code -eq 0 ]]; then
msg_ok "Installation completed successfully after APT repair!"
post_update_to_api "done" "0" "force"
return 0
else
msg_error "Installation still failed after APT repair (exit code: ${apt_retry_code})"
install_exit_code=$apt_retry_code
fi
fi
if [[ -n "${OOM_OPTION}" && "${response}" == "${OOM_OPTION}" ]]; then
# Retry with doubled resources
handled=true
echo -e "\n${TAB}${HOLD}${YW}Removing container ${CTID} for rebuild with more resources...${CL}"
pct stop "$CTID" &>/dev/null || true
pct destroy "$CTID" &>/dev/null || true
echo -e "${BFR}${CM}${GN}Container ${CTID} removed${CL}"
echo ""
local old_ctid="$CTID"
local old_ram="$RAM_SIZE"
local old_cpu="$CORE_COUNT"
export CTID=$(get_valid_container_id "$CTID")
export RAM_SIZE=$((RAM_SIZE * 2))
export CORE_COUNT=$((CORE_COUNT * 2))
export var_ram="$RAM_SIZE"
export var_cpu="$CORE_COUNT"
export VERBOSE="yes"
export var_verbose="yes"
export RECOVERY_ATTEMPT=$(( ${RECOVERY_ATTEMPT:-0} + 1 ))
echo -e "${YW}Rebuilding with increased resources (attempt ${RECOVERY_ATTEMPT}/2):${CL}"
echo -e " Container ID: ${old_ctid}${CTID}"
echo -e " RAM: ${old_ram}${GN}${RAM_SIZE}${CL} MiB (x2)"
echo -e " CPU: ${old_cpu}${GN}${CORE_COUNT}${CL} cores (x2)"
echo -e " Disk: ${DISK_SIZE} GB | Network: ${NET:-dhcp} | Bridge: ${BRG:-vmbr0}"
echo -e " Verbose: ${GN}enabled${CL}"
echo ""
msg_info "Restarting installation..."
build_container
return $?
fi
if [[ -n "${DNS_OPTION}" && "${response}" == "${DNS_OPTION}" ]]; then
# Retry with DNS override in LXC
handled=true
echo -e "\n${TAB}${HOLD}${YW}Removing container ${CTID} for rebuild with DNS override...${CL}"
pct stop "$CTID" &>/dev/null || true
pct destroy "$CTID" &>/dev/null || true
echo -e "${BFR}${CM}${GN}Container ${CTID} removed${CL}"
echo ""
local old_ctid="$CTID"
export CTID=$(get_valid_container_id "$CTID")
export DNS_RETRY_OVERRIDE="true"
export VERBOSE="yes"
export var_verbose="yes"
echo -e "${YW}Rebuilding with DNS override in LXC:${CL}"
echo -e " Container ID: ${old_ctid}${CTID}"
echo -e " DNS: ${GN}8.8.8.8, 1.1.1.1${CL} (inside LXC only)"
echo -e " Verbose: ${GN}enabled${CL}"
echo ""
msg_info "Restarting installation..."
build_container
return $?
fi
if [[ "$handled" == false ]]; then
echo -e "\n${TAB}${YW}Invalid option. Container ${CTID} kept.${CL}"
exit $install_exit_code
fi
;;
esac
else
# Timeout - auto-remove
echo ""
@@ -5273,6 +5570,7 @@ api_exit_script() {
if command -v pveversion >/dev/null 2>&1; then
trap 'api_exit_script' EXIT
fi
trap 'ensure_log_on_host; post_update_to_api "failed" "$?"' ERR
trap '_ERR_CODE=$?; ensure_log_on_host; post_update_to_api "failed" "$_ERR_CODE"' ERR
trap 'ensure_log_on_host; post_update_to_api "failed" "129"; exit 129' SIGHUP
trap 'ensure_log_on_host; post_update_to_api "failed" "130"; exit 130' SIGINT
trap 'ensure_log_on_host; post_update_to_api "failed" "143"; exit 143' SIGTERM

View File

@@ -37,21 +37,47 @@ if ! declare -f explain_exit_code &>/dev/null; then
case "$code" in
1) echo "General error / Operation not permitted" ;;
2) echo "Misuse of shell builtins (e.g. syntax error)" ;;
3) echo "General syntax or argument error" ;;
10) echo "Docker / privileged mode required (unsupported environment)" ;;
4) echo "curl: Feature not supported or protocol error" ;;
5) echo "curl: Could not resolve proxy" ;;
6) echo "curl: DNS resolution failed (could not resolve host)" ;;
7) echo "curl: Failed to connect (network unreachable / host down)" ;;
8) echo "curl: FTP server reply error" ;;
8) echo "curl: Server reply error (FTP/SFTP or apk untrusted key)" ;;
16) echo "curl: HTTP/2 framing layer error" ;;
18) echo "curl: Partial file (transfer not completed)" ;;
22) echo "curl: HTTP error returned (404, 429, 500+)" ;;
23) echo "curl: Write error (disk full or permissions)" ;;
24) echo "curl: Write to local file failed" ;;
25) echo "curl: Upload failed" ;;
26) echo "curl: Read error on local file (I/O)" ;;
27) echo "curl: Out of memory (memory allocation failed)" ;;
28) echo "curl: Operation timeout (network slow or server not responding)" ;;
30) echo "curl: FTP port command failed" ;;
32) echo "curl: FTP SIZE command failed" ;;
33) echo "curl: HTTP range error" ;;
34) echo "curl: HTTP post error" ;;
35) echo "curl: SSL/TLS handshake failed (certificate error)" ;;
36) echo "curl: FTP bad download resume" ;;
39) echo "curl: LDAP search failed" ;;
44) echo "curl: Internal error (bad function call order)" ;;
45) echo "curl: Interface error (failed to bind to specified interface)" ;;
46) echo "curl: Bad password entered" ;;
47) echo "curl: Too many redirects" ;;
48) echo "curl: Unknown command line option specified" ;;
51) echo "curl: SSL peer certificate or SSH host key verification failed" ;;
52) echo "curl: Empty reply from server (got nothing)" ;;
55) echo "curl: Failed sending network data" ;;
56) echo "curl: Receive error (connection reset by peer)" ;;
57) echo "curl: Unrecoverable poll/select error (system I/O failure)" ;;
59) echo "curl: Couldn't use specified SSL cipher" ;;
61) echo "curl: Bad/unrecognized transfer encoding" ;;
63) echo "curl: Maximum file size exceeded" ;;
75) echo "Temporary failure (retry later)" ;;
78) echo "curl: Remote file not found (404 on FTP/file)" ;;
79) echo "curl: SSH session error (key exchange/auth failed)" ;;
92) echo "curl: HTTP/2 stream error (protocol violation)" ;;
95) echo "curl: HTTP/3 layer error" ;;
64) echo "Usage error (wrong arguments)" ;;
65) echo "Data format error (bad input data)" ;;
66) echo "Input file not found (cannot open input)" ;;
@@ -69,15 +95,21 @@ if ! declare -f explain_exit_code &>/dev/null; then
101) echo "APT: Configuration error (bad sources.list, malformed config)" ;;
102) echo "APT: Lock held by another process (dpkg/apt still running)" ;;
124) echo "Command timed out (timeout command)" ;;
125) echo "Command failed to start (Docker daemon or execution error)" ;;
126) echo "Command invoked cannot execute (permission problem?)" ;;
127) echo "Command not found" ;;
128) echo "Invalid argument to exit" ;;
130) echo "Terminated by Ctrl+C (SIGINT)" ;;
129) echo "Killed by SIGHUP (terminal closed / hangup)" ;;
130) echo "Aborted by user (SIGINT)" ;;
131) echo "Killed by SIGQUIT (core dumped)" ;;
132) echo "Killed by SIGILL (illegal CPU instruction)" ;;
134) echo "Process aborted (SIGABRT - possibly Node.js heap overflow)" ;;
137) echo "Killed (SIGKILL / Out of memory?)" ;;
139) echo "Segmentation fault (core dumped)" ;;
141) echo "Broken pipe (SIGPIPE - output closed prematurely)" ;;
143) echo "Terminated (SIGTERM)" ;;
144) echo "Killed by signal 16 (SIGUSR1 / SIGSTKFLT)" ;;
146) echo "Killed by signal 18 (SIGTSTP)" ;;
150) echo "Systemd: Service failed to start" ;;
151) echo "Systemd: Service unit not found" ;;
152) echo "Permission denied (EACCES)" ;;
@@ -123,6 +155,7 @@ if ! declare -f explain_exit_code &>/dev/null; then
224) echo "Proxmox: PBS storage is for backups only" ;;
225) echo "Proxmox: No template available for OS/Version" ;;
231) echo "Proxmox: LXC stack upgrade failed" ;;
239) echo "npm/Node.js: Unexpected runtime error or dependency failure" ;;
243) echo "Node.js: Out of memory (JavaScript heap out of memory)" ;;
245) echo "Node.js: Invalid command-line option" ;;
246) echo "Node.js: Internal JavaScript Parse Error" ;;