Mirror of https://github.com/community-scripts/ProxmoxVE.git, synced 2026-03-06 10:55:56 +01:00

Compare commits: fix/github ... feature/vm (5 commits)

| Author | SHA1 | Date |
|---|---|---|
| | ed4e69dd75 | |
| | 9deba93bcf | |
| | c58a13a230 | |
| | 6b249d9533 | |
| | 85c3977c73 | |

docs/vm-smart-recovery-spec.md (normal file, 301 lines added)

@@ -0,0 +1,301 @@

# VM Smart Recovery: Work Instructions

**Branch:** `feature/vm-smart-recovery` (based on `main`)
**Related:** `feature/smart-error-recovery` (LXC, PR #11221)
**Created:** 2026-02-16

---

## 1. Current State

### Architecture Comparison: LXC vs. VM

| Aspect | LXC (done in PR #11221) | VM (open) |
|---|---|---|
| Shared code | `misc/build.func` (5577 lines) | `misc/vm-core.func` (627 lines), **used only by `docker-vm.sh`** |
| Number of scripts | ~170 | 15 |
| Architecture | All use `build_container()` | **2 generations** (see below) |
| Software install | `pct exec` → install script inside the container | Varies: `virt-customize`, Cloud-Init, `qm sendkey`, or nothing at all |
| Telemetry | `post_to_api()` + `post_update_to_api()` | Identical; all scripts source `misc/api.func` |
| Error handling | Centralized in `build.func` traps | Each script has its own `error_handler()` |
| Recovery | Smart menu with 6 dynamic options | **None**; on error the VM is destroyed immediately (`cleanup_vmid`) |

### Two Generations of VM Scripts

**Generation 1, legacy (monolithic):** `haos-vm.sh`, `debian-vm.sh`, `openwrt-vm.sh`, and 11 others.
- Self-contained scripts of 500–700 lines
- Define **all** utility functions inline (colors, icons, `msg_info`/`msg_ok`, `error_handler`, `cleanup`, etc.)
- Source only `misc/api.func` for telemetry

**Generation 2, modern (modular):** `docker-vm.sh` only.
- Sources three shared libraries:
  - `misc/api.func`: telemetry
  - `misc/vm-core.func`: shared utilities (627 lines)
  - `misc/cloud-init.func`: Cloud-Init configuration (709 lines)
- Calls `load_functions` from `vm-core.func`

### Telemetry Data (Top VM Failures)

| Script | Share of VM failures |
|---|---|
| `docker-vm.sh` | 30.1 % |
| `openwrt-vm.sh` | 25.9 % |
| `debian-13-vm.sh` | 9.6 % |

---

## 2. Scope & Boundaries

### In Scope

- Smart recovery for VM creation failures (retry menu analogous to LXC)
- Error detection: download, disk import, virt-customize, resource conflicts, network
- Exit-code mapping (already present in `api.func`, shared with LXC)

### Out of Scope (deliberately)

- **Migrating all legacy scripts to `vm-core.func`** → separate refactoring ticket
- **In-VM repair** → VMs have no `pct exec` equivalent
- **`qm sendkey` recovery** (OpenWrt) → inherently not retryable
- **APT/DPKG repair inside the VM** → no shell access during install

---

## 3. Software Installation Methods per Script

| Script | Method | Description |
|---|---|---|
| `docker-vm.sh` | `virt-customize` | Offline image manipulation (libguestfs) |
| `docker-vm.sh` (fallback) | systemd first-boot service | Script runs inside the VM on first boot |
| `haos-vm.sh` | None | Pre-built appliance (qcow2) |
| `debian-vm.sh` / `debian-13-vm.sh` | None / Cloud-Init | Base cloud image |
| `openwrt-vm.sh` | `qm sendkey` | Virtual keyboard automation |
| `opnsense-vm.sh` | `qm sendkey` + bootstrap | Virtual keyboard |
| `ubuntu-*-vm.sh` | Cloud-Init | User configures before start |
| `owncloud-vm.sh` | `virt-customize` | Same as `docker-vm.sh` |

---

## 4. Files & Changes

### 4.1 `misc/vm-core.func`: Central Recovery Logic

#### New function: `vm_error_handler_with_recovery()`

```
Flow:
├── Capture exit code ($? FIRST; no ensure_log_on_host before it!)
├── Error classification:
│   ├── Download error (curl exit 6/7/22/28/35/52/56)
│   ├── Disk import error (qm importdisk, pvesm alloc)
│   ├── virt-customize error (libguestfs)
│   ├── Resource conflict (VMID exists, storage full)
│   └── Network error (DNS, timeout)
├── Smart recovery menu:
│   ├── [1] Retry (destroy VM & recreate)
│   ├── [2] Retry with different settings (change RAM/CPU/disk)
│   ├── [3] Keep VM (do not destroy, debug manually)
│   ├── [4] Abort (destroy VM, exit)
│   └── Dynamic options depending on error type:
│       ├── Download error → "Clear cache & re-download"
│       └── Resource conflict → "Choose a different VMID"
└── On retry: cleanup_vmid() + call the create function again
```

#### New helper functions (error detection):

```bash
is_download_error()       # curl exit codes + HTTP 404/500
is_disk_import_error()    # qm importdisk stderr patterns
is_virt_customize_err()   # libguestfs error patterns
is_vmid_conflict()        # "already exists" in stderr
is_storage_full()         # "not enough space" patterns
```

#### Log Capture for VMs

Unlike LXC (where `/root/.install*.log` lives inside the container), VM errors must be captured directly from the stderr of the `qm`/`virt-customize` commands:

```bash
# Every critical command with stderr capture:
VM_ERROR_LOG="/tmp/vm-install-${VMID}.log"
qm importdisk "$VMID" "$IMAGE" "$STORAGE" 2>> "$VM_ERROR_LOG"
virt-customize -a "$IMAGE" --install docker.io 2>> "$VM_ERROR_LOG"
```

### 4.2 Retry Wrapper Architecture

Since VMs have no central `build_container()`, there are two approaches:

#### Option A: Wrapper in `vm-core.func` (recommended for Gen-2 scripts)

```bash
vm_create_with_recovery() {
  local create_fn="$1" # VM-specific creation function
  local max_retries=2
  local attempt=0
  local rc

  while true; do
    if "$create_fn"; then
      return 0 # success
    else
      rc=$? # capture the exit code right away, before anything resets $?
    fi
    attempt=$((attempt + 1))
    if ((attempt >= max_retries)); then
      : # max retries reached → only "keep" or "abort" remain
    fi
    vm_show_recovery_menu "$rc" "$attempt"
    # Process the menu selection...
  done
}
```

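To make the Option A loop concrete, here is a minimal non-interactive sketch. It assumes a hypothetical `choose_action` callback standing in for the whiptail menu and a hypothetical `flaky_create` function standing in for the VM-specific create function; `run_with_recovery` is likewise an illustrative name, not a function from the repository.

```shell
#!/usr/bin/env bash
# Sketch only: retry loop with an injectable decision callback.
# run_with_recovery, choose_action and flaky_create are hypothetical names.

run_with_recovery() {
  local create_fn="$1" choose_fn="$2" max_retries="${3:-2}"
  local attempt=0 rc choice

  while true; do
    if "$create_fn"; then
      return 0
    else
      rc=$? # capture before any other command overwrites $?
    fi
    attempt=$((attempt + 1))

    if ((attempt >= max_retries)); then
      choice="ABORT" # retry budget used up: no further retry is offered
    else
      choice=$("$choose_fn" "$rc" "$attempt")
    fi

    case "$choice" in
    RETRY) continue ;;
    *) return "$rc" ;; # hand the original exit code back to the caller
    esac
  done
}

# Demo stand-ins: fail once with curl's timeout code (28), then succeed.
CALLS=0
flaky_create() {
  CALLS=$((CALLS + 1))
  ((CALLS >= 2)) || return 28
}
choose_action() { echo "RETRY"; }
```

Calling `run_with_recovery flaky_create choose_action 3` retries once and then succeeds. Passing the decision in as a callback keeps the loop testable without a terminal; a real wrapper would call the menu function instead.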
#### Option B: Inline Recovery in Legacy Scripts

For the 14 legacy scripts (until they are migrated to `vm-core.func`):
- Minimal patch: extend `error_handler()` with a recovery prompt
- Do **not** call `cleanup_vmid` immediately; call it only after the user has decided

**Recommendation:** Start by implementing Option A for **`docker-vm.sh` only** (30.1 % of failures). Handle the legacy scripts as a second phase after migration.

### 4.3 `misc/api.func`: No Changes Needed

Exit-code mapping (`explain_exit_code()`) and `categorize_error()` are already universal (LXC + VM). Once PR #11221 is merged, 70+ exit codes are available. If this branch is finished first, the codes can be cherry-picked from `feature/smart-error-recovery`.

---

## 5. Key Differences: LXC vs. VM Recovery

| LXC | VM |
|---|---|
| APT/DPKG in-place repair inside the container | **Not possible**: no shell access during install |
| OOM retry with 2× resources | **Works**: `qm set` can change RAM/CPU after the fact |
| DNS override in the container (`/etc/resolv.conf`) | **Not applicable**: the VM has its own network |
| Container is preserved during repair | VM must be **recreated from scratch** on retry |
| `build_container()` as the central retry loop | **New wrapper function required** (`vm_create_with_recovery`) |
| `pct exec` for in-container access | No equivalent (qemu-guest-agent only if the VM is running and the agent is installed) |

---

## 6. Technical Pitfalls

### 6.1 VMID Cleanup Before Retry

`cleanup_vmid` must clean up completely:
- `qm stop "$VMID" --skiplock` (if running)
- `qm destroy "$VMID" --destroy-unreferenced-disks --purge`
- Some scripts create additional disks (`efidisk0`, `cloudinit`) that must be removed separately

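The cleanup steps listed above could be sketched as follows. This is an illustration under assumptions: `cleanup_vmid_sketch` is a hypothetical name, the real `cleanup_vmid` is defined elsewhere in the repository and may differ, and the extra handling for `efidisk0`/`cloudinit` disks is not shown.

```shell
# Sketch only; cleanup_vmid_sketch is a hypothetical name. It uses the same
# qm subcommands and flags that the list above names.
cleanup_vmid_sketch() {
  local vmid="$1"

  # Nothing to do if the VM does not exist.
  qm status "$vmid" >/dev/null 2>&1 || return 0

  # Stop the VM first if it is currently running.
  if qm status "$vmid" 2>/dev/null | grep -q "status: running"; then
    qm stop "$vmid" --skiplock >/dev/null 2>&1 || true
  fi

  # Remove the VM together with disks that are no longer referenced.
  qm destroy "$vmid" --destroy-unreferenced-disks --purge >/dev/null 2>&1
}
```

Returning early for a missing VM makes the function safe to call repeatedly, which matters when a retry loop may invoke cleanup more than once.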
### 6.2 Image Caching

`docker-vm.sh` caches images in `/var/lib/vz/template/cache/`. On a download retry:
- **Keep** the image if the download was complete (md5/sha check)
- **Delete** it if corruption is suspected (curl error, xz validation failed)

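The keep-or-delete rule above can be sketched with a checksum comparison. Both function names are hypothetical, and the sketch assumes an expected SHA-256 value is available for the image, which the spec does not state.

```shell
# Sketch only; should_keep_cached_image and prepare_cache_for_retry are
# hypothetical helpers, not functions from the repository.

# Returns 0 (keep) when the cached file matches the expected SHA-256,
# 1 (delete) when it is missing, empty, or does not match.
should_keep_cached_image() {
  local file="$1" expected_sha256="$2"
  [[ -s "$file" ]] || return 1
  local actual
  actual=$(sha256sum "$file" | awk '{print $1}')
  [[ "$actual" == "$expected_sha256" ]]
}

# On a download retry: keep a verified image, drop a suspect one.
prepare_cache_for_retry() {
  local file="$1" expected_sha256="$2"
  if should_keep_cached_image "$file" "$expected_sha256"; then
    echo "keep"
  else
    rm -f "$file"
    echo "delete"
  fi
}
```

For compressed images, an integrity test such as `xz -t` could be added as a second check before the file is kept.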
### 6.3 Cloud-Init State

If Cloud-Init was only partially configured, the entire state must be reset on retry:
```bash
qm set "$VMID" --delete cicustom
qm set "$VMID" --delete ciuser
qm set "$VMID" --delete cipassword
```

### 6.4 Legacy Scripts (14 in Total)

- Define `error_handler()` inline and source only `api.func`
- To add recovery there, either:
  - **Patch each script individually** (high risk, a lot of duplication)
  - **Migrate to `vm-core.func` first** (cleaner, but larger scope)
- **Recommendation:** prioritize the migration; recovery is trivial afterwards

### 6.5 `virt-customize` Fallback

`docker-vm.sh` already has a first-boot fallback for the Docker installation. If `virt-customize` fails:
- Recovery should treat this as a **"soft failure"**
- Actively suggest the fallback instead of a blind retry

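One way to express the soft-failure idea is a small dispatcher that switches to the first-boot path when customization fails or has been skipped. The function name and both callbacks are hypothetical; only the `SKIP_VIRT_CUSTOMIZE` flag name is taken from this change set.

```shell
# Sketch only; customize_or_fallback and its two callbacks are hypothetical.
customize_or_fallback() {
  local customize_fn="$1" fallback_fn="$2"

  # Try offline customization unless the user asked to skip it.
  if [[ "${SKIP_VIRT_CUSTOMIZE:-no}" != "yes" ]] && "$customize_fn"; then
    echo "customized"
    return 0
  fi

  # Soft failure: do not abort, use the first-boot path instead.
  "$fallback_fn" && echo "first-boot"
}
```

The function prints which path was taken, so a caller can log it or report it to telemetry.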
### 6.6 No `pct exec` Equivalent

- You **cannot "repair into the VM"** as you can with LXC
- `qm guest exec` does exist (with qemu-guest-agent), but it only works when:
  - The VM is running
  - The guest agent is installed
- That is typically exactly the point at which the install fails

---

## 7. Implementation Order

| Phase | Task | Files | Impact |
|---|---|---|---|
| **Phase 1** | `vm_error_handler_with_recovery()` skeleton | `misc/vm-core.func` | Foundation for everything else |
| **Phase 2** | `docker-vm.sh`: integrate recovery | `vm/docker-vm.sh` | 30.1 % of failures |
| **Phase 3** | Error detection (download, import, virt-customize) | `misc/vm-core.func` | Smart dynamic menu options |
| **Phase 4** | `haos-vm.sh`: integrate recovery (download retry) | `vm/haos-vm.sh` | Download-corruption handling already partly in place |
| **Phase 5** | `debian-13-vm.sh` + `ubuntu-*-vm.sh` | `vm/debian-13-vm.sh`, etc. | Cloud-image scripts |
| **Phase 6** | `openwrt-vm.sh` (limited: download/import retry only) | `vm/openwrt-vm.sh` | `sendkey` part not retryable |

---

## 8. Test Matrix

| Scenario | Expected behavior |
|---|---|
| Download error (curl 6/7/28) | Menu: "Retry download" + "Clear cache" |
| Disk import error | Menu: "Retry" + "Choose another storage" |
| VMID conflict | Menu: "Choose another VMID" + "Destroy existing VM" |
| virt-customize error (docker-vm) | Menu: "Retry" + "Use first-boot fallback" |
| Storage full | Menu: "Choose another storage" + "Shrink disk" |
| Network timeout | Menu: "Retry" + "Abort" |
| 2× retry reached | Only "Keep VM" or "Abort" remain |
| User chooses "Keep VM" | Do not destroy the VM; explain manual access |

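For automated checks, the matrix could be encoded as a lookup that a test compares against the real menu. This is only a sketch: `expected_menu_options` is a hypothetical helper, the scenario keys are illustrative, the labels are copied from the matrix rather than from the implementation, and the fallback for unknown scenarios is an assumption.

```shell
# Sketch only; expected_menu_options is a hypothetical test helper.
expected_menu_options() {
  local scenario="$1" attempt="${2:-1}" max_retries="${3:-2}"

  # Once the retry budget is used up, only two choices remain.
  if ((attempt >= max_retries)); then
    echo "Keep VM|Abort"
    return 0
  fi

  case "$scenario" in
  download) echo "Retry download|Clear cache" ;;
  disk_import) echo "Retry|Choose another storage" ;;
  vmid_conflict) echo "Choose another VMID|Destroy existing VM" ;;
  virt_customize) echo "Retry|Use first-boot fallback" ;;
  storage_full) echo "Choose another storage|Shrink disk" ;;
  network) echo "Retry|Abort" ;;
  *) echo "Retry|Abort" ;; # assumption: unknown errors get the generic pair
  esac
}
```

Keeping the expectations in one function gives every scenario in the table a single place to update when the menu changes.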
---

## 9. Branch Workflow

```bash
# Create the new branch (already done):
git checkout main
git pull origin main
git checkout -b feature/vm-smart-recovery

# Commit the work in phases:
# Phase 1: git commit -m "feat(vm): add vm_error_handler_with_recovery to vm-core.func"
# Phase 2: git commit -m "feat(vm): integrate smart recovery into docker-vm.sh"
# etc.

# Open the PR against main (NOT against feature/smart-error-recovery)
```

### Dependency on PR #11221

The `api.func` changes from `feature/smart-error-recovery` (70+ exit codes, `categorize_error()`) become available in `main` automatically once PR #11221 is merged.

- If the VM branch is started **after** PR #11221 is merged → everything is already there
- If the VM branch is finished **before** that → cherry-pick the `api.func` codes from `feature/smart-error-recovery`

---

## 10. Reference: Exit-0 Bug (LXC Only, Fixed)

> This bug affects **only LXC** (`misc/build.func`), not the VM scripts.

**Root cause:** The ERR trap in `build.func` called `ensure_log_on_host` before `post_update_to_api`. Because `ensure_log_on_host` returns with exit code 0, `$?` was reset to 0, so telemetry reported "failed/0" instead of the real exit code (~15-20 records/day).

**Fix (PR #11221, commit `2d7e707a0`):**
```bash
# Before (bug):
trap 'ensure_log_on_host; post_update_to_api "failed" "$?"' ERR

# After (fix):
trap '_ERR_CODE=$?; ensure_log_on_host; post_update_to_api "failed" "$_ERR_CODE"' ERR
```

**VM scripts not affected:** They capture `$?` correctly as the first line of `error_handler()`:
```bash
function error_handler() {
  local exit_code="$?" # first line → correct
  ...
}
```

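The effect can be reproduced outside `build.func` with a few lines. In this sketch `noop_logger` and `report` are hypothetical stand-ins for `ensure_log_on_host` and `post_update_to_api`, and each trap variant runs in a fresh `bash` process so that the surrounding shell's options cannot influence the result.

```shell
#!/usr/bin/env bash
# Sketch only: shows how a successful command inside an ERR trap resets $?.
# run_trap_demo, noop_logger, report and fail28 are hypothetical names.

run_trap_demo() {
  # $1 is the trap body to install; the failing command returns 28.
  bash -s -- "$1" <<'EOF'
noop_logger() { return 0; }     # succeeds, so it overwrites $? with 0
report() { echo "failed/$1"; }
fail28() { return 28; }
trap "$1" ERR
fail28
:
EOF
}

buggy=$(run_trap_demo 'noop_logger; report "$?"')
fixed=$(run_trap_demo '_ERR_CODE=$?; noop_logger; report "$_ERR_CODE"')

echo "buggy trap reports: $buggy" # prints "buggy trap reports: failed/0"
echo "fixed trap reports: $fixed" # prints "fixed trap reports: failed/28"
```

The only difference between the two variants is that the fixed one saves `$?` into a variable before any other command runs.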
@@ -624,3 +624,417 @@ EOF
  qm set "$VMID" -description "$DESCRIPTION" >/dev/null

}

# ==============================================================================
# SECTION: VM SMART RECOVERY
# ==============================================================================

# Global error log for VM creation; captures stderr from critical commands
VM_ERROR_LOG="${VM_ERROR_LOG:-/tmp/vm-install-$$.log}"
VM_RECOVERY_ATTEMPT=${VM_RECOVERY_ATTEMPT:-0}
VM_MAX_RETRIES=${VM_MAX_RETRIES:-2}

# ------------------------------------------------------------------------------
# vm_log_cmd()
#
# - Wraps a command to capture stderr into VM_ERROR_LOG
# - Passes stdout through normally
# - Returns the original exit code
# Usage: vm_log_cmd qm importdisk "$VMID" "$IMAGE" "$STORAGE"
# ------------------------------------------------------------------------------
vm_log_cmd() {
  "$@" 2>>"$VM_ERROR_LOG"
}

# ------------------------------------------------------------------------------
# is_vm_download_error()
#
# - Detects download failures based on exit code and error log
# - Checks curl exit codes (6, 7, 22, 28, 35, 52, 56) and HTTP error patterns
# - Returns 0 (true) if download error detected, 1 otherwise
# ------------------------------------------------------------------------------
is_vm_download_error() {
  local exit_code="${1:-0}"
  local log_file="${2:-$VM_ERROR_LOG}"

  # curl-specific exit codes indicating download issues
  case "$exit_code" in
  6 | 7 | 22 | 28 | 35 | 52 | 56) return 0 ;;
  esac

  # Check log for download-related patterns
  if [[ -s "$log_file" ]]; then
    if grep -qiE "curl.*failed|download.*failed|HTTP.*[45][0-9]{2}|Could not resolve|Connection refused|Connection timed out|SSL.*error" "$log_file" 2>/dev/null; then
      return 0
    fi
  fi
  return 1
}

# ------------------------------------------------------------------------------
# is_vm_disk_import_error()
#
# - Detects disk import failures (qm importdisk / qm disk import)
# - Checks for storage allocation and format conversion errors
# - Returns 0 (true) if disk import error detected, 1 otherwise
# ------------------------------------------------------------------------------
is_vm_disk_import_error() {
  local exit_code="${1:-0}"
  local log_file="${2:-$VM_ERROR_LOG}"

  if [[ -s "$log_file" ]]; then
    if grep -qiE "importdisk.*failed|disk import.*error|storage.*allocation.*failed|qcow2.*error|raw.*error|pvesm.*alloc.*failed|unable to create|volume.*already exists" "$log_file" 2>/dev/null; then
      return 0
    fi
  fi
  return 1
}

# ------------------------------------------------------------------------------
# is_vm_virt_customize_error()
#
# - Detects virt-customize / libguestfs failures
# - Checks for guestfs, supermin, appliance boot errors
# - Returns 0 (true) if virt-customize error detected, 1 otherwise
# ------------------------------------------------------------------------------
is_vm_virt_customize_error() {
  local exit_code="${1:-0}"
  local log_file="${2:-$VM_ERROR_LOG}"

  if [[ -s "$log_file" ]]; then
    if grep -qiE "virt-customize|libguestfs|guestfs|supermin|appliance.*boot|virt-.*failed|launch.*failed" "$log_file" 2>/dev/null; then
      return 0
    fi
  fi
  return 1
}

# ------------------------------------------------------------------------------
# is_vm_vmid_conflict()
#
# - Detects VMID conflicts (VM already exists)
# - Returns 0 (true) if conflict detected, 1 otherwise
# ------------------------------------------------------------------------------
is_vm_vmid_conflict() {
  local exit_code="${1:-0}"
  local log_file="${2:-$VM_ERROR_LOG}"

  if [[ -s "$log_file" ]]; then
    if grep -qiE "already exists|VM $VMID already|unable to create VM|VMID.*in use" "$log_file" 2>/dev/null; then
      return 0
    fi
  fi
  return 1
}

# ------------------------------------------------------------------------------
# is_vm_storage_full()
#
# - Detects storage full / space exhaustion errors
# - Returns 0 (true) if storage space issue detected, 1 otherwise
# ------------------------------------------------------------------------------
is_vm_storage_full() {
  local exit_code="${1:-0}"
  local log_file="${2:-$VM_ERROR_LOG}"

  if [[ -s "$log_file" ]]; then
    if grep -qiE "not enough space|no space left|storage.*full|disk quota|ENOSPC|insufficient.*space|thin pool.*full" "$log_file" 2>/dev/null; then
      return 0
    fi
  fi
  return 1
}

# ------------------------------------------------------------------------------
# is_vm_network_error()
#
# - Detects general network/DNS errors beyond download failures
# - Returns 0 (true) if network issue detected, 1 otherwise
# ------------------------------------------------------------------------------
is_vm_network_error() {
  local exit_code="${1:-0}"
  local log_file="${2:-$VM_ERROR_LOG}"

  # Network-related curl/wget exit codes
  case "$exit_code" in
  6 | 7 | 28 | 52 | 56) return 0 ;;
  esac

  if [[ -s "$log_file" ]]; then
    if grep -qiE "Name or service not known|Temporary failure in name resolution|Network is unreachable|No route to host|DNS.*failed|could not resolve" "$log_file" 2>/dev/null; then
      return 0
    fi
  fi
  return 1
}

# ------------------------------------------------------------------------------
# vm_classify_error()
#
# - Classifies a VM creation error into a category
# - Order matters: most specific checks first
# - Returns category string via stdout
# - Categories: vmid_conflict, storage_full, download, disk_import,
#               virt_customize, network, unknown
# ------------------------------------------------------------------------------
vm_classify_error() {
  local exit_code="${1:-0}"
  local log_file="${2:-$VM_ERROR_LOG}"

  if is_vm_vmid_conflict "$exit_code" "$log_file"; then
    echo "vmid_conflict"
  elif is_vm_storage_full "$exit_code" "$log_file"; then
    echo "storage_full"
  elif is_vm_download_error "$exit_code" "$log_file"; then
    echo "download"
  elif is_vm_disk_import_error "$exit_code" "$log_file"; then
    echo "disk_import"
  elif is_vm_virt_customize_error "$exit_code" "$log_file"; then
    echo "virt_customize"
  elif is_vm_network_error "$exit_code" "$log_file"; then
    echo "network"
  else
    echo "unknown"
  fi
}

# ------------------------------------------------------------------------------
# vm_show_recovery_menu()
#
# - Displays a whiptail menu with recovery options after a VM creation failure
# - Options are dynamically built based on error category
# - Returns the selected option via stdout
# - Arguments:
#     $1: exit_code
#     $2: error_category (from vm_classify_error)
#     $3: current attempt number
# ------------------------------------------------------------------------------
vm_show_recovery_menu() {
  local exit_code="${1:-1}"
  local error_category="${2:-unknown}"
  local attempt="${3:-1}"

  local menu_items=()
  local menu_height=12
  local item_count=0
  local settings_offered="no"

  # --- Dynamic options based on error category ---

  # Retry options (offered only while retries remain)
  if ((attempt < VM_MAX_RETRIES)); then
    case "$error_category" in
    download)
      menu_items+=("RETRY_DOWNLOAD" "🔄 Retry download (clear cache & re-download)" "ON")
      ;;
    disk_import)
      menu_items+=("RETRY" "🔄 Retry VM creation" "ON")
      ;;
    virt_customize)
      menu_items+=("RETRY" "🔄 Retry VM creation" "ON")
      menu_items+=("SKIP_CUSTOMIZE" "⏭️ Skip virt-customize (use first-boot fallback)" "OFF")
      ;;
    network)
      menu_items+=("RETRY" "🔄 Retry VM creation" "ON")
      ;;
    vmid_conflict)
      menu_items+=("NEW_VMID" "🆔 Choose a different VM ID" "ON")
      ;;
    storage_full)
      menu_items+=("RETRY_SETTINGS" "⚙️ Retry with different settings (storage/disk)" "ON")
      settings_offered="yes"
      ;;
    *)
      menu_items+=("RETRY" "🔄 Retry VM creation" "ON")
      ;;
    esac

    # Retry with different resources; skipped when the category above already
    # added RETRY_SETTINGS, so the same tag is not listed twice
    if [[ "$settings_offered" != "yes" ]]; then
      menu_items+=("RETRY_SETTINGS" "⚙️ Retry with different settings (RAM/CPU/Disk)" "OFF")
    fi
  fi

  # Keep VM for debugging (always available)
  menu_items+=("KEEP" "🔍 Keep partial VM for manual debugging" "OFF")

  # Abort (always available)
  menu_items+=("ABORT" "❌ Destroy VM and exit" "OFF")

  # Each radiolist entry uses three array elements (tag, text, state).
  # Deriving the count avoids ((item_count++)), which returns non-zero when
  # the old value is 0 and would trip "set -e" or an ERR trap.
  item_count=$((${#menu_items[@]} / 3))
  menu_height=$((item_count + 10))

  # Error info for title
  local title="VM CREATION FAILED"
  local body="Exit code: ${exit_code} | Category: ${error_category}\nAttempt: ${attempt}/${VM_MAX_RETRIES}\n\nChoose a recovery action:"

  if ((attempt >= VM_MAX_RETRIES)); then
    body="Exit code: ${exit_code} | Category: ${error_category}\n⚠️ Maximum retries (${VM_MAX_RETRIES}) reached.\n\nChoose an action:"
  fi

  local choice
  choice=$(whiptail --backtitle "Proxmox VE Helper Scripts" --title "$title" \
    --radiolist "$body" "$menu_height" 72 "$item_count" \
    "${menu_items[@]}" 3>&1 1>&2 2>&3) || choice="ABORT"

  echo "$choice"
}

|
# ------------------------------------------------------------------------------
|
||||||
|
# vm_handle_recovery()
|
||||||
|
#
|
||||||
|
# - Main recovery handler called from error_handler or a wrapper
|
||||||
|
# - Classifies the error, shows recovery menu, and executes the chosen action
|
||||||
|
# - Arguments:
|
||||||
|
# $1: exit_code
|
||||||
|
# $2: line_number
|
||||||
|
# $3: failed_command
|
||||||
|
# $4: cleanup_fn — function to call for VM cleanup (default: cleanup_vmid)
|
||||||
|
# $5: retry_fn — function to re-invoke for full retry (required for retry)
|
||||||
|
# - Uses global: VM_ERROR_LOG, VM_RECOVERY_ATTEMPT, VM_MAX_RETRIES, VMID
|
||||||
|
# - Returns: 0 if retry was chosen (caller should re-run), 1 if abort/keep
|
||||||
|
# ------------------------------------------------------------------------------
|
||||||
|
vm_handle_recovery() {
|
||||||
|
local exit_code="${1:-1}"
|
||||||
|
local line_number="${2:-?}"
|
||||||
|
local failed_command="${3:-unknown}"
|
||||||
|
local cleanup_fn="${4:-cleanup_vmid}"
|
||||||
|
local retry_fn="${5:-}"
|
||||||
|
|
||||||
|
# Stop any running spinner
|
||||||
|
stop_spinner 2>/dev/null || true
|
||||||
|
|
||||||
|
# Classify the error
|
||||||
|
local error_category
|
||||||
|
error_category=$(vm_classify_error "$exit_code" "$VM_ERROR_LOG")
|
||||||
|
|
||||||
|
((VM_RECOVERY_ATTEMPT++))
|
||||||
|
|
||||||
|
# Show error details
|
||||||
|
echo ""
|
||||||
|
msg_error "VM creation failed in line ${line_number}"
|
||||||
|
msg_error "Exit code: ${exit_code} | Category: ${error_category}"
|
||||||
|
msg_error "Command: ${failed_command}"
|
||||||
|
|
||||||
|
# Show last few lines of error log if available
|
||||||
|
if [[ -s "$VM_ERROR_LOG" ]]; then
|
||||||
|
echo -e "\n${TAB}${YW}--- Last 5 lines of error log ---${CL}"
|
||||||
|
tail -n 5 "$VM_ERROR_LOG" 2>/dev/null | while IFS= read -r line; do
|
||||||
|
echo -e "${TAB} ${line}"
|
||||||
|
done
|
||||||
|
echo -e "${TAB}${YW}----------------------------------${CL}\n"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Show recovery menu
|
||||||
|
local choice
|
||||||
|
choice=$(vm_show_recovery_menu "$exit_code" "$error_category" "$VM_RECOVERY_ATTEMPT")
|
||||||
|
|
||||||
|
case "$choice" in
|
||||||
|
RETRY | RETRY_DOWNLOAD)
|
||||||
|
msg_info "Cleaning up failed VM ${VMID} for retry"
|
||||||
|
"$cleanup_fn" 2>/dev/null || true
|
||||||
|
rm -f "$VM_ERROR_LOG"
|
||||||
|
rm -f "${WORK_FILE:-}" 2>/dev/null
|
||||||
|
|
||||||
|
if [[ "$choice" == "RETRY_DOWNLOAD" ]]; then
|
||||||
|
# Clear cached image
|
||||||
|
if [[ -n "${CACHE_FILE:-}" && -f "$CACHE_FILE" ]]; then
|
||||||
|
msg_info "Clearing cached image: $(basename "$CACHE_FILE")"
|
||||||
|
rm -f "$CACHE_FILE"
|
||||||
|
msg_ok "Cache cleared"
|
||||||
|
fi
|
||||||
|
fi
|
||||||
|
|
||||||
|
msg_ok "Ready for retry (attempt $((VM_RECOVERY_ATTEMPT + 1))/${VM_MAX_RETRIES})"
|
||||||
|
|
||||||
|
if [[ -n "$retry_fn" ]]; then
|
||||||
|
# Re-invoke the retry function — caller loop handles this
|
||||||
|
return 0
|
||||||
|
else
|
||||||
|
msg_warn "No retry function provided — please re-run the script manually"
|
||||||
|
return 1
|
||||||
|
fi
|
||||||
|
;;
|
||||||
|
|
||||||
|
SKIP_CUSTOMIZE)
|
||||||
|
msg_info "Cleaning up failed VM ${VMID} for retry (skipping virt-customize)"
|
||||||
|
"$cleanup_fn" 2>/dev/null || true
|
||||||
|
rm -f "$VM_ERROR_LOG"
|
||||||
|
rm -f "${WORK_FILE:-}" 2>/dev/null
|
||||||
|
# Set flag so docker-vm.sh skips virt-customize
|
||||||
|
export SKIP_VIRT_CUSTOMIZE="yes"
|
||||||
|
msg_ok "Will use first-boot fallback for package installation"
|
||||||
|
|
||||||
|
if [[ -n "$retry_fn" ]]; then
|
||||||
|
return 0
|
||||||
|
else
|
||||||
|
msg_warn "No retry function provided — please re-run the script manually"
|
||||||
|
return 1
|
||||||
|
fi
|
||||||
|
;;
|
||||||
|
|
||||||
|
RETRY_SETTINGS)
|
||||||
|
msg_info "Cleaning up failed VM ${VMID} for retry with new settings"
|
||||||
|
"$cleanup_fn" 2>/dev/null || true
|
||||||
|
rm -f "$VM_ERROR_LOG"
|
||||||
|
rm -f "${WORK_FILE:-}" 2>/dev/null
|
||||||
|
|
||||||
|
# Let user choose new settings via advanced_settings if available
|
||||||
|
if declare -f advanced_settings >/dev/null 2>&1; then
|
||||||
|
header_info 2>/dev/null || true
|
||||||
|
echo -e "${ADVANCED:-}${BOLD}${RD}Reconfigure VM Settings${CL}"
|
||||||
|
advanced_settings
|
||||||
|
else
|
||||||
|
msg_warn "advanced_settings() not available — using current settings"
|
||||||
|
fi
|
||||||
|
|
||||||
|
if [[ -n "$retry_fn" ]]; then
|
||||||
|
return 0
|
||||||
|
else
|
||||||
|
msg_warn "No retry function provided — please re-run the script manually"
|
||||||
|
return 1
|
||||||
|
fi
|
||||||
|
;;
|
||||||
|
|
||||||
|
NEW_VMID)
|
||||||
|
msg_info "Cleaning up conflicting VM ${VMID}"
|
||||||
|
"$cleanup_fn" 2>/dev/null || true
|
||||||
|
rm -f "$VM_ERROR_LOG"
|
||||||
|
rm -f "${WORK_FILE:-}" 2>/dev/null
|
||||||
|
VMID=$(get_valid_nextid)
|
||||||
|
echo -e "${CONTAINERID:-}${BOLD}${DGN}New Virtual Machine ID: ${BGN}${VMID}${CL}"
|
||||||
|
msg_ok "Using new VMID: ${VMID}"
|
||||||
|
|
||||||
|
if [[ -n "$retry_fn" ]]; then
|
||||||
|
return 0
|
||||||
|
else
|
||||||
|
msg_warn "No retry function provided — please re-run the script manually"
|
||||||
|
return 1
|
||||||
|
fi
|
||||||
|
;;
|
||||||
|
|
||||||
|
KEEP)
|
||||||
|
msg_warn "Keeping partial VM ${VMID} for manual debugging"
|
||||||
|
msg_warn "You can inspect it with: qm config ${VMID}"
|
||||||
|
msg_warn "To remove it later: qm destroy ${VMID} --destroy-unreferenced-disks --purge"
|
||||||
|
# Report failure to telemetry
|
||||||
|
post_update_to_api "failed" "$exit_code" 2>/dev/null || true
|
||||||
|
exit "$exit_code"
|
||||||
|
;;
|
||||||
|
|
||||||
|
ABORT | *)
|
||||||
|
msg_info "Destroying failed VM ${VMID}"
|
||||||
|
"$cleanup_fn" 2>/dev/null || true
|
||||||
|
rm -f "$VM_ERROR_LOG"
|
||||||
|
post_update_to_api "failed" "$exit_code" 2>/dev/null || true
|
||||||
|
msg_error "VM creation aborted by user"
|
||||||
|
exit "$exit_code"
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
}
|
||||||
|
|||||||
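The return-code contract of `vm_handle_recovery` can be sketched as follows: the retry branches return 0 only when a retry function name was passed and return 1 otherwise, while KEEP and ABORT exit directly. This is a minimal sketch, not the function itself. `fake_recovery` and `create_vm_stub` are hypothetical stand-ins introduced for illustration, and no Proxmox commands are called.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for vm_handle_recovery: returns 0 when a retry branch
# was chosen AND a retry function name was supplied, 1 otherwise.
fake_recovery() {
  local choice="$1" retry_fn="${2:-}"
  case "$choice" in
    RETRY* | NEW_VMID)
      if [[ -n "$retry_fn" ]]; then return 0; else return 1; fi
      ;;
    *) return 1 ;;
  esac
}

attempts=0
manual_rerun="no"
# Hypothetical stand-in for create_vm: only counts how often it was called.
create_vm_stub() { attempts=$((attempts + 1)); }

# Caller side: re-invoke the creation function only on return code 0.
if fake_recovery "RETRY_SETTINGS" "create_vm_stub"; then
  create_vm_stub
fi

# Without a retry function the caller has to fall back to a manual re-run.
fake_recovery "RETRY" "" || manual_rerun="yes"

echo "attempts=${attempts} manual_rerun=${manual_rerun}"
```

Keeping the decision (return code) separate from the action (calling the retry function) is what lets each script pass in its own `create_vm` and cleanup function.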
@@ -65,13 +65,63 @@ trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
|
|||||||
trap cleanup EXIT
|
trap cleanup EXIT
|
||||||
trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
|
trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
|
||||||
trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
|
trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
|
||||||
|
|
||||||
|
# Smart recovery state
|
||||||
|
VM_CREATION_PHASE="no"
|
||||||
|
VM_RECOVERY_ATTEMPT=0
|
||||||
|
VM_MAX_RETRIES=2
|
||||||
|
|
||||||
function error_handler() {
|
function error_handler() {
|
||||||
local exit_code="$?"
|
local exit_code="$?"
|
||||||
local line_number="$1"
|
local line_number="$1"
|
||||||
local command="$2"
|
local command="$2"
|
||||||
local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
|
local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
|
||||||
post_update_to_api "failed" "${exit_code}"
|
|
||||||
echo -e "\n$error_message\n"
|
echo -e "\n$error_message\n"
|
||||||
|
|
||||||
|
# During VM creation phase: offer recovery menu instead of immediate cleanup
|
||||||
|
if [[ "$VM_CREATION_PHASE" == "yes" && $VM_RECOVERY_ATTEMPT -lt $VM_MAX_RETRIES ]]; then
|
||||||
|
VM_RECOVERY_ATTEMPT=$((VM_RECOVERY_ATTEMPT + 1)) # ((var++)) returns 1 when var is 0, which trips set -e
|
||||||
|
trap - ERR
|
||||||
|
set +e
|
||||||
|
|
||||||
|
local choice
|
||||||
|
choice=$(whiptail --backtitle "Proxmox VE Helper Scripts" --title "VM CREATION FAILED" \
|
||||||
|
--radiolist "Exit code: ${exit_code} | Attempt: ${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES}\nFailed command: ${command}\n\nChoose a recovery action:" 16 72 4 \
|
||||||
|
"RETRY" "Retry VM creation" "ON" \
|
||||||
|
"SKIP_CUSTOMIZE" "Retry and skip image customization" "OFF" \
|
||||||
|
"KEEP" "Keep partial VM for debugging" "OFF" \
|
||||||
|
"ABORT" "Destroy VM and exit" "OFF" \
|
||||||
|
3>&1 1>&2 2>&3) || choice="ABORT"
|
||||||
|
|
||||||
|
case "$choice" in
|
||||||
|
RETRY | SKIP_CUSTOMIZE)
|
||||||
|
msg_info "Cleaning up failed VM ${VMID} for retry"
|
||||||
|
cleanup_vmid 2>/dev/null || true
|
||||||
|
rm -f "${WORK_FILE:-}" 2>/dev/null
|
||||||
|
[[ "$choice" == "SKIP_CUSTOMIZE" ]] && export SKIP_VIRT_CUSTOMIZE="yes"
|
||||||
|
msg_ok "Ready for retry ${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES}"
|
||||||
|
set -e
|
||||||
|
trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
|
||||||
|
create_vm
|
||||||
|
exit $?
|
||||||
|
;;
|
||||||
|
KEEP)
|
||||||
|
echo -e "\n${YW} Keeping partial VM ${VMID} for debugging${CL}"
|
||||||
|
echo -e " Inspect: qm config ${VMID}"
|
||||||
|
echo -e " Remove: qm destroy ${VMID} --destroy-unreferenced-disks --purge\n"
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
|
exit "$exit_code"
|
||||||
|
;;
|
||||||
|
*)
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
|
cleanup_vmid
|
||||||
|
exit "$exit_code"
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Default: no recovery (max retries exceeded or outside creation phase)
|
||||||
|
post_update_to_api "failed" "${exit_code}"
|
||||||
cleanup_vmid
|
cleanup_vmid
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -485,6 +535,7 @@ fi
|
|||||||
msg_ok "Using ${CL}${BL}$STORAGE${CL} ${GN}for Storage Location."
|
msg_ok "Using ${CL}${BL}$STORAGE${CL} ${GN}for Storage Location."
|
||||||
msg_ok "Virtual Machine ID is ${CL}${BL}$VMID${CL}."
|
msg_ok "Virtual Machine ID is ${CL}${BL}$VMID${CL}."
|
||||||
|
|
||||||
|
create_vm() {
|
||||||
# ==============================================================================
|
# ==============================================================================
|
||||||
# PREREQUISITES
|
# PREREQUISITES
|
||||||
# ==============================================================================
|
# ==============================================================================
|
||||||
@@ -511,11 +562,12 @@ msg_ok "Downloaded ${CL}${BL}${FILE}${CL}"
|
|||||||
# ==============================================================================
|
# ==============================================================================
|
||||||
# IMAGE CUSTOMIZATION
|
# IMAGE CUSTOMIZATION
|
||||||
# ==============================================================================
|
# ==============================================================================
|
||||||
msg_info "Customizing ${FILE} image"
|
|
||||||
|
|
||||||
WORK_FILE=$(mktemp --suffix=.qcow2)
|
WORK_FILE=$(mktemp --suffix=.qcow2)
|
||||||
cp "$FILE" "$WORK_FILE"
|
cp "$FILE" "$WORK_FILE"
|
||||||
|
|
||||||
|
if [[ "${SKIP_VIRT_CUSTOMIZE:-}" != "yes" ]]; then
|
||||||
|
msg_info "Customizing ${FILE} image"
|
||||||
|
|
||||||
# Set hostname
|
# Set hostname
|
||||||
virt-customize -q -a "$WORK_FILE" --hostname "${HN}" >/dev/null 2>&1
|
virt-customize -q -a "$WORK_FILE" --hostname "${HN}" >/dev/null 2>&1
|
||||||
|
|
||||||
@@ -551,6 +603,9 @@ EOF' >/dev/null 2>&1 || true
|
|||||||
fi
|
fi
|
||||||
|
|
||||||
msg_ok "Customized image"
|
msg_ok "Customized image"
|
||||||
|
else
|
||||||
|
msg_ok "Skipped image customization (hostname and login not pre-configured)"
|
||||||
|
fi
|
||||||
|
|
||||||
STORAGE_TYPE=$(pvesm status -storage "$STORAGE" | awk 'NR>1 {print $2}')
|
STORAGE_TYPE=$(pvesm status -storage "$STORAGE" | awk 'NR>1 {print $2}')
|
||||||
case $STORAGE_TYPE in
|
case $STORAGE_TYPE in
|
||||||
@@ -650,3 +705,8 @@ fi
|
|||||||
|
|
||||||
msg_ok "Completed successfully!\n"
|
msg_ok "Completed successfully!\n"
|
||||||
echo "More Info at https://github.com/community-scripts/ProxmoxVE/discussions/836"
|
echo "More Info at https://github.com/community-scripts/ProxmoxVE/discussions/836"
|
||||||
|
} # end create_vm
|
||||||
|
|
||||||
|
VM_CREATION_PHASE="yes"
|
||||||
|
create_vm
|
||||||
|
VM_CREATION_PHASE="no"
|
||||||
|
|||||||
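The `SKIP_VIRT_CUSTOMIZE` flag is read with a `${VAR:-}` default, so the check also works when the variable was never set, even under `set -u`. A minimal sketch of that pattern, where `customize_image` is a hypothetical stand-in and `virt-customize` is not actually run:

```shell
#!/usr/bin/env bash
set -u                       # unset variables are errors, as in strict scripts
unset SKIP_VIRT_CUSTOMIZE    # start from a clean state for the demonstration

# Hypothetical stand-in for the customization step inside create_vm.
customize_image() {
  if [[ "${SKIP_VIRT_CUSTOMIZE:-}" == "yes" ]]; then
    echo "skipped"
  else
    echo "customized"
  fi
}

first=$(customize_image)          # flag unset: customization runs
export SKIP_VIRT_CUSTOMIZE="yes"  # what the recovery menu sets before a retry
second=$(customize_image)         # flag set: customization is skipped

echo "first=${first} second=${second}"
```

Because the flag is exported rather than passed as an argument, the retried `create_vm` needs no signature change to pick it up.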
@@ -40,10 +40,32 @@ trap cleanup EXIT
|
|||||||
trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
|
trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
|
||||||
trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
|
trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
|
||||||
|
|
||||||
|
# Flag to control whether recovery menu is shown (set during create_vm)
|
||||||
|
VM_CREATION_PHASE="no"
|
||||||
|
|
||||||
function error_handler() {
|
function error_handler() {
|
||||||
local exit_code="$?"
|
local exit_code="$?"
|
||||||
local line_number="$1"
|
local line_number="$1"
|
||||||
local command="$2"
|
local command="$2"
|
||||||
|
|
||||||
|
# During VM creation phase: use smart recovery if available
|
||||||
|
if [[ "$VM_CREATION_PHASE" == "yes" ]] && declare -f vm_handle_recovery >/dev/null 2>&1; then
|
||||||
|
# Temporarily disable ERR trap + set -e to prevent recursion during recovery menu
|
||||||
|
trap - ERR
|
||||||
|
set +e
|
||||||
|
|
||||||
|
if vm_handle_recovery "$exit_code" "$line_number" "$command" "cleanup_vmid" "create_vm"; then
|
||||||
|
# Recovery chose retry — re-invoke create_vm with traps restored
|
||||||
|
set -e
|
||||||
|
trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
|
||||||
|
create_vm
|
||||||
|
exit $?
|
||||||
|
fi
|
||||||
|
# Reached only when vm_handle_recovery returned non-zero without exiting (abort/keep exit inside it)
|
||||||
|
exit "$exit_code"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Default error handling (outside VM creation phase)
|
||||||
local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
|
local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
|
||||||
post_update_to_api "failed" "${exit_code}"
|
post_update_to_api "failed" "${exit_code}"
|
||||||
echo -e "\n$error_message\n"
|
echo -e "\n$error_message\n"
|
||||||
@@ -436,6 +458,15 @@ if ! command -v virt-customize &>/dev/null; then
|
|||||||
msg_ok "Installed libguestfs-tools"
|
msg_ok "Installed libguestfs-tools"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
# ==============================================================================
|
||||||
|
# VM CREATION FUNCTION (wrapped for smart recovery retry)
|
||||||
|
# ==============================================================================
|
||||||
|
create_vm() {
|
||||||
|
|
||||||
|
# Reset error log for this attempt
|
||||||
|
VM_ERROR_LOG="/tmp/vm-install-${VMID}.log"
|
||||||
|
: >"$VM_ERROR_LOG"
|
||||||
|
|
||||||
# ==============================================================================
|
# ==============================================================================
|
||||||
# IMAGE DOWNLOAD
|
# IMAGE DOWNLOAD
|
||||||
# ==============================================================================
|
# ==============================================================================
|
||||||
@@ -492,8 +523,12 @@ export LIBGUESTFS_BACKEND_SETTINGS=dns=8.8.8.8,1.1.1.1
|
|||||||
DOCKER_PREINSTALLED="no"
|
DOCKER_PREINSTALLED="no"
|
||||||
|
|
||||||
# Install qemu-guest-agent and Docker during image customization
|
# Install qemu-guest-agent and Docker during image customization
|
||||||
|
# Skip if recovery set SKIP_VIRT_CUSTOMIZE (virt-customize failed before)
|
||||||
|
if [[ "${SKIP_VIRT_CUSTOMIZE:-}" == "yes" ]]; then
|
||||||
|
msg_ok "Skipping virt-customize (using first-boot fallback)"
|
||||||
|
else
|
||||||
msg_info "Installing base packages in image"
|
msg_info "Installing base packages in image"
|
||||||
if virt-customize -a "$WORK_FILE" --install qemu-guest-agent,curl,ca-certificates >/dev/null 2>&1; then
|
if virt-customize -a "$WORK_FILE" --install qemu-guest-agent,curl,ca-certificates 2>>"$VM_ERROR_LOG" >/dev/null; then
|
||||||
msg_ok "Installed base packages"
|
msg_ok "Installed base packages"
|
||||||
|
|
||||||
msg_info "Installing Docker (this may take 2-5 minutes)"
|
msg_info "Installing Docker (this may take 2-5 minutes)"
|
||||||
@@ -522,6 +557,7 @@ EOF' >/dev/null 2>&1
|
|||||||
else
|
else
|
||||||
msg_ok "Packages will be installed on first boot"
|
msg_ok "Packages will be installed on first boot"
|
||||||
fi
|
fi
|
||||||
|
fi
|
||||||
|
|
||||||
msg_info "Finalizing image (hostname, SSH config)"
|
msg_info "Finalizing image (hostname, SSH config)"
|
||||||
# Set hostname and prepare for unique machine-id
|
# Set hostname and prepare for unique machine-id
|
||||||
@@ -612,7 +648,7 @@ msg_ok "Resized disk image"
|
|||||||
msg_info "Creating Docker VM shell"
|
msg_info "Creating Docker VM shell"
|
||||||
|
|
||||||
qm create $VMID -agent 1${MACHINE} -tablet 0 -localtime 1 -bios ovmf${CPU_TYPE} -cores $CORE_COUNT -memory $RAM_SIZE \
|
qm create $VMID -agent 1${MACHINE} -tablet 0 -localtime 1 -bios ovmf${CPU_TYPE} -cores $CORE_COUNT -memory $RAM_SIZE \
|
||||||
-name $HN -tags community-script -net0 virtio,bridge=$BRG,macaddr=$MAC$VLAN$MTU -onboot 1 -ostype l26 -scsihw virtio-scsi-pci >/dev/null
|
-name $HN -tags community-script -net0 virtio,bridge=$BRG,macaddr=$MAC$VLAN$MTU -onboot 1 -ostype l26 -scsihw virtio-scsi-pci 2>>"$VM_ERROR_LOG" >/dev/null
|
||||||
|
|
||||||
msg_ok "Created VM shell"
|
msg_ok "Created VM shell"
|
||||||
|
|
||||||
@@ -627,7 +663,7 @@ else
|
|||||||
IMPORT_CMD=(qm importdisk)
|
IMPORT_CMD=(qm importdisk)
|
||||||
fi
|
fi
|
||||||
|
|
||||||
IMPORT_OUT="$("${IMPORT_CMD[@]}" "$VMID" "$WORK_FILE" "$STORAGE" ${DISK_IMPORT:-} 2>&1 || true)"
|
IMPORT_OUT="$("${IMPORT_CMD[@]}" "$VMID" "$WORK_FILE" "$STORAGE" ${DISK_IMPORT:-} 2> >(tee -a "$VM_ERROR_LOG") || true)"
|
||||||
DISK_REF_IMPORTED="$(printf '%s\n' "$IMPORT_OUT" | sed -n "s/.*successfully imported disk '\([^']\+\)'.*/\1/p" | tr -d "\r\"'")"
|
DISK_REF_IMPORTED="$(printf '%s\n' "$IMPORT_OUT" | sed -n "s/.*successfully imported disk '\([^']\+\)'.*/\1/p" | tr -d "\r\"'")"
|
||||||
[[ -z "$DISK_REF_IMPORTED" ]] && DISK_REF_IMPORTED="$(pvesm list "$STORAGE" | awk -v id="$VMID" '$5 ~ ("vm-"id"-disk-") {print $1":"$5}' | sort | tail -n1)"
|
[[ -z "$DISK_REF_IMPORTED" ]] && DISK_REF_IMPORTED="$(pvesm list "$STORAGE" | awk -v id="$VMID" '$5 ~ ("vm-"id"-disk-") {print $1":"$5}' | sort | tail -n1)"
|
||||||
[[ -z "$DISK_REF_IMPORTED" ]] && {
|
[[ -z "$DISK_REF_IMPORTED" ]] && {
|
||||||
@@ -709,3 +745,13 @@ fi
|
|||||||
|
|
||||||
post_update_to_api "done" "none"
|
post_update_to_api "done" "none"
|
||||||
msg_ok "Completed successfully!\n"
|
msg_ok "Completed successfully!\n"
|
||||||
|
|
||||||
|
} # end of create_vm()
|
||||||
|
|
||||||
|
# ==============================================================================
|
||||||
|
# VM CREATION WITH SMART RECOVERY
|
||||||
|
# ==============================================================================
|
||||||
|
VM_CREATION_PHASE="yes"
|
||||||
|
create_vm
|
||||||
|
VM_CREATION_PHASE="no"
|
||||||
|
rm -f "$VM_ERROR_LOG" 2>/dev/null || true
|
||||||
|
|||||||
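The redirection `2>>"$VM_ERROR_LOG" >/dev/null` appends only stderr to the log and discards stdout, so the recovery step can later show why a command failed. A minimal sketch of that behaviour, where `noisy` is a hypothetical command and the log is a temporary file rather than the real `VM_ERROR_LOG`:

```shell
#!/usr/bin/env bash
log=$(mktemp)

# Hypothetical failing command: progress on stdout, the real error on stderr.
noisy() {
  echo "progress output"
  echo "disk full" >&2
  return 1
}

# stderr is appended to the log, stdout is discarded; the exit status of the
# command is still what the if statement tests.
if noisy 2>>"$log" >/dev/null; then
  result="ok"
else
  result="failed"
fi

logged=$(cat "$log")
rm -f "$log"
echo "result=${result} logged=${logged}"
```

Appending with `>>` rather than truncating with `>` keeps the messages of earlier commands from the same attempt in the log.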
@@ -69,13 +69,65 @@ trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
|
|||||||
trap cleanup EXIT
|
trap cleanup EXIT
|
||||||
trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
|
trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
|
||||||
trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
|
trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
|
||||||
|
|
||||||
|
# Smart recovery state
|
||||||
|
VM_CREATION_PHASE="no"
|
||||||
|
VM_RECOVERY_ATTEMPT=0
|
||||||
|
VM_MAX_RETRIES=2
|
||||||
|
|
||||||
function error_handler() {
|
function error_handler() {
|
||||||
local exit_code="$?"
|
local exit_code="$?"
|
||||||
local line_number="$1"
|
local line_number="$1"
|
||||||
local command="$2"
|
local command="$2"
|
||||||
local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
|
local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
|
||||||
post_update_to_api "failed" "${exit_code}"
|
|
||||||
echo -e "\n$error_message\n"
|
echo -e "\n$error_message\n"
|
||||||
|
|
||||||
|
# During VM creation phase: offer recovery menu instead of immediate cleanup
|
||||||
|
if [[ "$VM_CREATION_PHASE" == "yes" && $VM_RECOVERY_ATTEMPT -lt $VM_MAX_RETRIES ]]; then
|
||||||
|
VM_RECOVERY_ATTEMPT=$((VM_RECOVERY_ATTEMPT + 1)) # ((var++)) returns 1 when var is 0, which trips set -e
|
||||||
|
trap - ERR
|
||||||
|
set +e
|
||||||
|
|
||||||
|
local choice
|
||||||
|
choice=$(whiptail --backtitle "Proxmox VE Helper Scripts" --title "VM CREATION FAILED" \
|
||||||
|
--radiolist "Exit code: ${exit_code} | Attempt: ${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES}\nFailed command: ${command}\n\nChoose a recovery action:" 16 72 4 \
|
||||||
|
"RETRY" "Retry VM creation" "ON" \
|
||||||
|
"RETRY_DOWNLOAD" "Retry with fresh download (clear cache)" "OFF" \
|
||||||
|
"KEEP" "Keep partial VM for debugging" "OFF" \
|
||||||
|
"ABORT" "Destroy VM and exit" "OFF" \
|
||||||
|
3>&1 1>&2 2>&3) || choice="ABORT"
|
||||||
|
|
||||||
|
case "$choice" in
|
||||||
|
RETRY | RETRY_DOWNLOAD)
|
||||||
|
msg_info "Cleaning up failed VM ${VMID} for retry"
|
||||||
|
cleanup_vmid 2>/dev/null || true
|
||||||
|
if [[ "$choice" == "RETRY_DOWNLOAD" && -n "${CACHE_FILE:-}" ]]; then
|
||||||
|
rm -f "$CACHE_FILE"
|
||||||
|
msg_ok "Cleared cached image"
|
||||||
|
fi
|
||||||
|
msg_ok "Ready for retry ${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES}"
|
||||||
|
set -e
|
||||||
|
trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
|
||||||
|
create_vm
|
||||||
|
exit $?
|
||||||
|
;;
|
||||||
|
KEEP)
|
||||||
|
echo -e "\n${YW} Keeping partial VM ${VMID} for debugging${CL}"
|
||||||
|
echo -e " Inspect: qm config ${VMID}"
|
||||||
|
echo -e " Remove: qm destroy ${VMID} --destroy-unreferenced-disks --purge\n"
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
|
exit "$exit_code"
|
||||||
|
;;
|
||||||
|
*)
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
|
cleanup_vmid
|
||||||
|
exit "$exit_code"
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Default: no recovery (max retries exceeded or outside creation phase)
|
||||||
|
post_update_to_api "failed" "${exit_code}"
|
||||||
cleanup_vmid
|
cleanup_vmid
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -554,6 +606,7 @@ fi
|
|||||||
msg_ok "Using ${CL}${BL}$STORAGE${CL} ${GN}for Storage Location."
|
msg_ok "Using ${CL}${BL}$STORAGE${CL} ${GN}for Storage Location."
|
||||||
msg_ok "Virtual Machine ID is ${CL}${BL}$VMID${CL}."
|
msg_ok "Virtual Machine ID is ${CL}${BL}$VMID${CL}."
|
||||||
|
|
||||||
|
create_vm() {
|
||||||
var_version="${BRANCH}"
|
var_version="${BRANCH}"
|
||||||
msg_info "Retrieving the URL for Home Assistant ${BRANCH} Disk Image"
|
msg_info "Retrieving the URL for Home Assistant ${BRANCH} Disk Image"
|
||||||
if [ "$BRANCH" == "$dev" ]; then
|
if [ "$BRANCH" == "$dev" ]; then
|
||||||
@@ -658,3 +711,8 @@ if [ "$START_VM" == "yes" ]; then
|
|||||||
fi
|
fi
|
||||||
post_update_to_api "done" "none"
|
post_update_to_api "done" "none"
|
||||||
msg_ok "Completed successfully!\n"
|
msg_ok "Completed successfully!\n"
|
||||||
|
} # end create_vm
|
||||||
|
|
||||||
|
VM_CREATION_PHASE="yes"
|
||||||
|
create_vm
|
||||||
|
VM_CREATION_PHASE="no"
|
||||||
|
|||||||
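When a retry counter is incremented while `set -e` is active, the form of the increment matters: an arithmetic command returns status 1 whenever its expression evaluates to 0, which is the case for a post-increment starting at 0. The sketch below only records the exit statuses. It deliberately leaves errexit off and guards the first command with `||` so that both results can be printed.

```shell
#!/usr/bin/env bash
# Post-increment as an arithmetic command: the expression yields the OLD
# value (0), so the command itself reports failure.
n=0
post_increment_status=0
((n++)) || post_increment_status=$?

# Plain assignment with arithmetic expansion: always succeeds.
m=0
m=$((m + 1))
assignment_status=$?

echo "n=${n} status=${post_increment_status}"
echo "m=${m} status=${assignment_status}"
```

Both variables end up at 1. Only the reported status differs, and that status is what errexit reacts to.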
@@ -70,13 +70,61 @@ trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
|
|||||||
trap cleanup EXIT
|
trap cleanup EXIT
|
||||||
trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
|
trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
|
||||||
trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
|
trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
|
||||||
|
|
||||||
|
# Smart recovery state
|
||||||
|
VM_CREATION_PHASE="no"
|
||||||
|
VM_RECOVERY_ATTEMPT=0
|
||||||
|
VM_MAX_RETRIES=2
|
||||||
|
|
||||||
function error_handler() {
|
function error_handler() {
|
||||||
local exit_code="$?"
|
local exit_code="$?"
|
||||||
local line_number="$1"
|
local line_number="$1"
|
||||||
local command="$2"
|
local command="$2"
|
||||||
post_update_to_api "failed" "$exit_code"
|
|
||||||
local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
|
local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
|
||||||
echo -e "\n$error_message\n"
|
echo -e "\n$error_message\n"
|
||||||
|
|
||||||
|
# During VM creation phase: offer recovery menu instead of immediate cleanup
|
||||||
|
if [[ "$VM_CREATION_PHASE" == "yes" && $VM_RECOVERY_ATTEMPT -lt $VM_MAX_RETRIES ]]; then
|
||||||
|
VM_RECOVERY_ATTEMPT=$((VM_RECOVERY_ATTEMPT + 1)) # ((var++)) returns 1 when var is 0, which trips set -e
|
||||||
|
trap - ERR
|
||||||
|
set +e
|
||||||
|
set +o pipefail
|
||||||
|
|
||||||
|
local choice
|
||||||
|
choice=$(whiptail --backtitle "Proxmox VE Helper Scripts" --title "VM CREATION FAILED" \
|
||||||
|
--radiolist "Exit code: ${exit_code} | Attempt: ${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES}\nFailed command: ${command}\n\nChoose a recovery action:" 16 72 3 \
|
||||||
|
"RETRY" "Retry VM creation" "ON" \
|
||||||
|
"KEEP" "Keep partial VM for debugging" "OFF" \
|
||||||
|
"ABORT" "Destroy VM and exit" "OFF" \
|
||||||
|
3>&1 1>&2 2>&3) || choice="ABORT"
|
||||||
|
|
||||||
|
case "$choice" in
|
||||||
|
RETRY)
|
||||||
|
msg_info "Cleaning up failed VM ${VMID} for retry"
|
||||||
|
cleanup_vmid 2>/dev/null || true
|
||||||
|
msg_ok "Ready for retry ${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES}"
|
||||||
|
set -Eeo pipefail
|
||||||
|
trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
|
||||||
|
create_vm
|
||||||
|
exit $?
|
||||||
|
;;
|
||||||
|
KEEP)
|
||||||
|
echo -e "\n${YW} Keeping partial VM ${VMID} for debugging${CL}"
|
||||||
|
echo -e " Inspect: qm config ${VMID}"
|
||||||
|
echo -e " Remove: qm destroy ${VMID} --destroy-unreferenced-disks --purge\n"
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
|
exit "$exit_code"
|
||||||
|
;;
|
||||||
|
*)
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
|
cleanup_vmid
|
||||||
|
exit "$exit_code"
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Default: no recovery (max retries exceeded or outside creation phase)
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
cleanup_vmid
|
cleanup_vmid
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -520,6 +568,8 @@ else
|
|||||||
fi
|
fi
|
||||||
msg_ok "Using ${CL}${BL}$STORAGE${CL} ${GN}for Storage Location."
|
msg_ok "Using ${CL}${BL}$STORAGE${CL} ${GN}for Storage Location."
|
||||||
msg_ok "Virtual Machine ID is ${CL}${BL}$VMID${CL}."
|
msg_ok "Virtual Machine ID is ${CL}${BL}$VMID${CL}."
|
||||||
|
|
||||||
|
create_vm() {
|
||||||
msg_info "Getting URL for OpenWrt Disk Image"
|
msg_info "Getting URL for OpenWrt Disk Image"
|
||||||
|
|
||||||
response=$(curl -fsSL https://openwrt.org)
|
response=$(curl -fsSL https://openwrt.org)
|
||||||
@@ -661,3 +711,8 @@ if [ -z "$VLAN" ] && [ "$VLAN2" != "999" ]; then
|
|||||||
fi
|
fi
|
||||||
post_update_to_api "done" "none"
|
post_update_to_api "done" "none"
|
||||||
msg_ok "Completed Successfully!${VLAN_FINISH:+\n$VLAN_FINISH}"
|
msg_ok "Completed Successfully!${VLAN_FINISH:+\n$VLAN_FINISH}"
|
||||||
|
} # end create_vm
|
||||||
|
|
||||||
|
VM_CREATION_PHASE="yes"
|
||||||
|
create_vm
|
||||||
|
VM_CREATION_PHASE="no"
|
||||||
|
|||||||
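The `3>&1 1>&2 2>&3` suffix on the `whiptail` call swaps stdout and stderr: `whiptail` draws its interface on stdout and reports the selected tag on stderr, so the swap is what lets `$(...)` capture the selection. A minimal sketch with `fake_dialog` as a hypothetical stand-in, so it runs without a terminal UI:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for whiptail: "UI" on stdout, selection on stderr.
fake_dialog() {
  echo "[dialog drawn here]" # interface output
  echo "RETRY" >&2           # selected tag
}

# Inside $(...), stdout is the capture pipe.
#   3>&1  save the capture pipe as fd 3
#   1>&2  send the interface to the caller's stderr (the terminal)
#   2>&3  send the selection into the capture pipe
# A non-zero exit (Cancel/Esc in whiptail) falls back to ABORT.
choice=$(fake_dialog 3>&1 1>&2 2>&3) || choice="ABORT"

echo "choice=${choice}"
```

The `|| choice="ABORT"` fallback makes cancelling the dialog behave like an explicit abort instead of leaving `choice` empty.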
@@ -62,13 +62,60 @@ trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
|
|||||||
trap cleanup EXIT
|
trap cleanup EXIT
|
||||||
trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
|
trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
|
||||||
trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
|
trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
|
||||||
|
|
||||||
|
# Smart recovery state
|
||||||
|
VM_CREATION_PHASE="no"
|
||||||
|
VM_RECOVERY_ATTEMPT=0
|
||||||
|
VM_MAX_RETRIES=2
|
||||||
|
|
||||||
function error_handler() {
|
function error_handler() {
|
||||||
local exit_code="$?"
|
local exit_code="$?"
|
||||||
local line_number="$1"
|
local line_number="$1"
|
||||||
local command="$2"
|
local command="$2"
|
||||||
post_update_to_api "failed" "$exit_code"
|
|
||||||
local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
|
local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
|
||||||
echo -e "\n$error_message\n"
|
echo -e "\n$error_message\n"
|
||||||
|
|
||||||
|
# During VM creation phase: offer recovery menu instead of immediate cleanup
|
||||||
|
if [[ "$VM_CREATION_PHASE" == "yes" && $VM_RECOVERY_ATTEMPT -lt $VM_MAX_RETRIES ]]; then
|
||||||
|
VM_RECOVERY_ATTEMPT=$((VM_RECOVERY_ATTEMPT + 1)) # ((var++)) returns 1 when var is 0, which trips set -e
|
||||||
|
trap - ERR
|
||||||
|
set +e
|
||||||
|
|
||||||
|
local choice
|
||||||
|
choice=$(whiptail --backtitle "Proxmox VE Helper Scripts" --title "VM CREATION FAILED" \
|
||||||
|
--radiolist "Exit code: ${exit_code} | Attempt: ${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES}\nFailed command: ${command}\n\nChoose a recovery action:" 16 72 3 \
|
||||||
|
"RETRY" "Retry VM creation" "ON" \
|
||||||
|
"KEEP" "Keep partial VM for debugging" "OFF" \
|
||||||
|
"ABORT" "Destroy VM and exit" "OFF" \
|
||||||
|
3>&1 1>&2 2>&3) || choice="ABORT"
|
||||||
|
|
||||||
|
case "$choice" in
|
||||||
|
RETRY)
|
||||||
|
msg_info "Cleaning up failed VM ${VMID} for retry"
|
||||||
|
cleanup_vmid 2>/dev/null || true
|
||||||
|
msg_ok "Ready for retry ${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES}"
|
||||||
|
set -e
|
||||||
|
trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
|
||||||
|
create_vm
|
||||||
|
exit $?
|
||||||
|
;;
|
||||||
|
KEEP)
|
||||||
|
echo -e "\n${YW} Keeping partial VM ${VMID} for debugging${CL}"
|
||||||
|
echo -e " Inspect: qm config ${VMID}"
|
||||||
|
echo -e " Remove: qm destroy ${VMID} --destroy-unreferenced-disks --purge\n"
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
|
exit "$exit_code"
|
||||||
|
;;
|
||||||
|
*)
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
|
cleanup_vmid
|
||||||
|
exit "$exit_code"
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Default: no recovery (max retries exceeded or outside creation phase)
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
cleanup_vmid
|
cleanup_vmid
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -466,6 +513,8 @@ else
|
|||||||
fi
|
fi
|
||||||
msg_ok "Using ${CL}${BL}$STORAGE${CL} ${GN}for Storage Location."
|
msg_ok "Using ${CL}${BL}$STORAGE${CL} ${GN}for Storage Location."
|
||||||
msg_ok "Virtual Machine ID is ${CL}${BL}$VMID${CL}."
|
msg_ok "Virtual Machine ID is ${CL}${BL}$VMID${CL}."
|
||||||
|
|
||||||
|
create_vm() {
|
||||||
msg_info "Retrieving the URL for the Ubuntu 22.04 Disk Image"
|
msg_info "Retrieving the URL for the Ubuntu 22.04 Disk Image"
|
||||||
URL=https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img
|
URL=https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img
|
||||||
sleep 2
|
sleep 2
|
||||||
@@ -562,3 +611,8 @@ post_update_to_api "done" "none"
|
|||||||
msg_ok "Completed successfully!\n"
|
msg_ok "Completed successfully!\n"
|
||||||
echo -e "Setup Cloud-Init before starting \n
|
echo -e "Setup Cloud-Init before starting \n
|
||||||
More info at https://github.com/community-scripts/ProxmoxVE/discussions/272 \n"
|
More info at https://github.com/community-scripts/ProxmoxVE/discussions/272 \n"
|
||||||
|
} # end create_vm
|
||||||
|
|
||||||
|
VM_CREATION_PHASE="yes"
|
||||||
|
create_vm
|
||||||
|
VM_CREATION_PHASE="no"
|
||||||
|
|||||||
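With `VM_MAX_RETRIES=2`, the guard `VM_RECOVERY_ATTEMPT -lt VM_MAX_RETRIES` allows one initial run plus two retries. The sketch below illustrates that counting with an explicit loop. This is an alternative formulation for illustration, not how the scripts are structured (they re-invoke `create_vm` from the error handler), and `create_vm_stub` is a hypothetical stand-in that fails twice before succeeding.

```shell
#!/usr/bin/env bash
VM_MAX_RETRIES=2
VM_RECOVERY_ATTEMPT=0
runs=0

# Hypothetical stand-in for create_vm: fails on runs 1 and 2, succeeds on 3.
create_vm_stub() {
  runs=$((runs + 1))
  [[ $runs -ge 3 ]]
}

until create_vm_stub; do
  if [[ $VM_RECOVERY_ATTEMPT -ge $VM_MAX_RETRIES ]]; then
    echo "giving up after ${runs} runs"
    break
  fi
  VM_RECOVERY_ATTEMPT=$((VM_RECOVERY_ATTEMPT + 1))
  echo "retry ${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES}"
done

echo "runs=${runs} retries=${VM_RECOVERY_ATTEMPT}"
```

A loop keeps the retry count and the creation call at the same nesting level, which may be easier to reason about than a handler that calls back into the function that failed.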
@@ -65,13 +65,60 @@ trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
|
|||||||
trap cleanup EXIT
|
trap cleanup EXIT
|
||||||
trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
|
trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
|
||||||
trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
|
trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
|
||||||
|
|
||||||
|
# Smart recovery state
|
||||||
|
VM_CREATION_PHASE="no"
|
||||||
|
VM_RECOVERY_ATTEMPT=0
|
||||||
|
VM_MAX_RETRIES=2
|
||||||
|
|
||||||
function error_handler() {
|
function error_handler() {
|
||||||
local exit_code="$?"
|
local exit_code="$?"
|
||||||
local line_number="$1"
|
local line_number="$1"
|
||||||
local command="$2"
|
local command="$2"
|
||||||
post_update_to_api "failed" "$exit_code"
|
|
||||||
local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
|
local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
|
||||||
echo -e "\n$error_message\n"
|
echo -e "\n$error_message\n"
|
||||||
|
|
||||||
|
# During VM creation phase: offer recovery menu instead of immediate cleanup
|
||||||
|
if [[ "$VM_CREATION_PHASE" == "yes" && $VM_RECOVERY_ATTEMPT -lt $VM_MAX_RETRIES ]]; then
|
||||||
|
VM_RECOVERY_ATTEMPT=$((VM_RECOVERY_ATTEMPT + 1)) # ((var++)) returns 1 when var is 0, which trips set -e
|
||||||
|
trap - ERR
|
||||||
|
set +e
|
||||||
|
|
||||||
|
local choice
|
||||||
|
choice=$(whiptail --backtitle "Proxmox VE Helper Scripts" --title "VM CREATION FAILED" \
|
||||||
|
--radiolist "Exit code: ${exit_code} | Attempt: ${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES}\nFailed command: ${command}\n\nChoose a recovery action:" 16 72 3 \
|
||||||
|
"RETRY" "Retry VM creation" "ON" \
|
||||||
|
"KEEP" "Keep partial VM for debugging" "OFF" \
|
||||||
|
"ABORT" "Destroy VM and exit" "OFF" \
|
||||||
|
3>&1 1>&2 2>&3) || choice="ABORT"
|
||||||
|
|
||||||
|
case "$choice" in
|
||||||
|
RETRY)
|
||||||
|
msg_info "Cleaning up failed VM ${VMID} for retry"
|
||||||
|
cleanup_vmid 2>/dev/null || true
|
||||||
|
msg_ok "Ready for retry ${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES}"
|
||||||
|
set -e
|
||||||
|
trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
|
||||||
|
create_vm
|
||||||
|
exit $?
|
||||||
|
;;
|
||||||
|
KEEP)
|
||||||
|
echo -e "\n${YW} Keeping partial VM ${VMID} for debugging${CL}"
|
||||||
|
echo -e " Inspect: qm config ${VMID}"
|
||||||
|
echo -e " Remove: qm destroy ${VMID} --destroy-unreferenced-disks --purge\n"
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
|
exit "$exit_code"
|
||||||
|
;;
|
||||||
|
*)
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
|
cleanup_vmid
|
||||||
|
exit "$exit_code"
|
||||||
|
;;
|
||||||
|
esac
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Default: no recovery (max retries exceeded or outside creation phase)
|
||||||
|
post_update_to_api "failed" "$exit_code"
|
||||||
cleanup_vmid
|
cleanup_vmid
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -468,6 +515,8 @@ else
|
|||||||
fi
|
fi
|
||||||
msg_ok "Using ${CL}${BL}$STORAGE${CL} ${GN}for Storage Location."
|
msg_ok "Using ${CL}${BL}$STORAGE${CL} ${GN}for Storage Location."
|
||||||
msg_ok "Virtual Machine ID is ${CL}${BL}$VMID${CL}."
|
msg_ok "Virtual Machine ID is ${CL}${BL}$VMID${CL}."
|
||||||
|
|
||||||
|
create_vm() {
|
||||||
msg_info "Retrieving the URL for the Ubuntu 24.04 Disk Image"
|
msg_info "Retrieving the URL for the Ubuntu 24.04 Disk Image"
|
||||||
URL=https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
|
URL=https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
|
||||||
sleep 2
|
sleep 2
|
||||||
@@ -564,3 +613,8 @@ post_update_to_api "done" "none"
|
|||||||
msg_ok "Completed successfully!\n"
|
msg_ok "Completed successfully!\n"
|
||||||
echo -e "Setup Cloud-Init before starting \n
|
echo -e "Setup Cloud-Init before starting \n
|
||||||
More info at https://github.com/community-scripts/ProxmoxVE/discussions/272 \n"
|
More info at https://github.com/community-scripts/ProxmoxVE/discussions/272 \n"
|
||||||
|
} # end create_vm
|
||||||
|
|
||||||
|
VM_CREATION_PHASE="yes"
|
||||||
|
create_vm
|
||||||
|
VM_CREATION_PHASE="no"
|
||||||
|
|||||||
@@ -64,13 +64,60 @@ trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
 trap cleanup EXIT
 trap 'post_update_to_api "failed" "INTERRUPTED"' SIGINT
 trap 'post_update_to_api "failed" "TERMINATED"' SIGTERM
+
+# Smart recovery state
+VM_CREATION_PHASE="no"
+VM_RECOVERY_ATTEMPT=0
+VM_MAX_RETRIES=2
+
 function error_handler() {
   local exit_code="$?"
   local line_number="$1"
   local command="$2"
-  post_update_to_api "failed" "$exit_code"
   local error_message="${RD}[ERROR]${CL} in line ${RD}$line_number${CL}: exit code ${RD}$exit_code${CL}: while executing command ${YW}$command${CL}"
   echo -e "\n$error_message\n"
+
+  # During VM creation phase: offer recovery menu instead of immediate cleanup
+  if [[ "$VM_CREATION_PHASE" == "yes" && $VM_RECOVERY_ATTEMPT -lt $VM_MAX_RETRIES ]]; then
+    VM_RECOVERY_ATTEMPT=$((VM_RECOVERY_ATTEMPT + 1)) # not ((var++)): it returns status 1 when var is 0, which can abort under set -e
+    trap - ERR
+    set +e
+
+    local choice
+    choice=$(whiptail --backtitle "Proxmox VE Helper Scripts" --title "VM CREATION FAILED" \
+      --radiolist "Exit code: ${exit_code} | Attempt: ${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES}\nFailed command: ${command}\n\nChoose a recovery action:" 16 72 3 \
+      "RETRY" "Retry VM creation" "ON" \
+      "KEEP" "Keep partial VM for debugging" "OFF" \
+      "ABORT" "Destroy VM and exit" "OFF" \
+      3>&1 1>&2 2>&3) || choice="ABORT"
+
+    case "$choice" in
+    RETRY)
+      msg_info "Cleaning up failed VM ${VMID} for retry"
+      cleanup_vmid 2>/dev/null || true
+      msg_ok "Ready for retry (${VM_RECOVERY_ATTEMPT}/${VM_MAX_RETRIES})"
+      set -e
+      trap 'error_handler $LINENO "$BASH_COMMAND"' ERR
+      create_vm
+      exit $?
+      ;;
+    KEEP)
+      echo -e "\n${YW}  Keeping partial VM ${VMID} for debugging${CL}"
+      echo -e "  Inspect: qm config ${VMID}"
+      echo -e "  Remove:  qm destroy ${VMID} --destroy-unreferenced-disks --purge\n"
+      post_update_to_api "failed" "$exit_code"
+      exit "$exit_code"
+      ;;
+    *)
+      post_update_to_api "failed" "$exit_code"
+      cleanup_vmid
+      exit "$exit_code"
+      ;;
+    esac
+  fi
+
+  # Default: no recovery (max retries exceeded or outside creation phase)
+  post_update_to_api "failed" "$exit_code"
   cleanup_vmid
 }
 
@@ -467,6 +514,8 @@ else
 fi
 msg_ok "Using ${CL}${BL}$STORAGE${CL} ${GN}for Storage Location."
 msg_ok "Virtual Machine ID is ${CL}${BL}$VMID${CL}."
+
+create_vm() {
 msg_info "Retrieving the URL for the Ubuntu 25.04 Disk Image"
 URL=https://cloud-images.ubuntu.com/plucky/current/plucky-server-cloudimg-amd64.img
 sleep 2
@@ -563,3 +612,8 @@ post_update_to_api "done" "none"
 msg_ok "Completed successfully!\n"
 echo -e "Setup Cloud-Init before starting \n
 More info at https://github.com/community-scripts/ProxmoxVE/discussions/272 \n"
+} # end create_vm
+
+VM_CREATION_PHASE="yes"
+create_vm
+VM_CREATION_PHASE="no"