Understanding Scientific Computing File Errors
Scientific computing relies heavily on specialized file formats designed to store complex numerical data, multidimensional arrays, simulation results, and research findings. These file formats often combine raw data with extensive metadata, enabling researchers to preserve not just results but also the context of experiments and analyses. When errors occur in these files, they can potentially compromise research integrity, delay publication, or lead to loss of irreplaceable experimental data.
This comprehensive guide addresses common file errors in scientific computing across various formats, including HDF5, NetCDF, MATLAB files, Jupyter notebooks, simulation outputs, and other research data formats. From corrupted headers and structural damage to version incompatibilities and metadata inconsistencies, we'll explore the typical issues researchers face when working with scientific data files. Whether you're a researcher, data scientist, engineer, or IT support for scientific computing, this guide provides detailed troubleshooting approaches and recovery techniques to help preserve valuable research data.
Common Scientific Computing File Formats
Before diving into specific errors, it's important to understand the various file formats commonly used in scientific computing:
- HDF5 (.h5, .hdf5) - Hierarchical Data Format, a versatile format for storing large, complex datasets with rich metadata
- NetCDF (.nc, .cdf) - Network Common Data Form, widely used in climate science, geosciences, and atmospheric research
- MATLAB (.mat) - MATLAB's native format for storing workspace variables, widely used in engineering and signal processing
- Jupyter Notebooks (.ipynb) - JSON-based format that combines code, output, visualizations, and markdown documentation
- CSV/TSV (.csv, .tsv) - Simple tabular formats commonly used for data exchange
- Parquet/Arrow (.parquet, .arrow) - Columnar storage formats optimized for big data analytics
- FITS (.fits, .fit) - Flexible Image Transport System, standard in astronomy and astrophysics
- NPY/NPZ (.npy, .npz) - NumPy's binary format for storing array data efficiently
- Domain-specific formats - Formats like PDB (protein structures), GROMACS (molecular dynamics), or ROOT (particle physics)
Each format has specific structures, capabilities, and common issues. Understanding the format you're working with is crucial for effective troubleshooting.
Error #1: "HDF5 File Corrupted" or "Cannot Access HDF5 Dataset"
Symptoms
When attempting to open an HDF5 file, you may encounter error messages like "Unable to open HDF5 file," "HDF5 signature not found," or "Cannot read from dataset." Software may fail to load the file entirely, or it might load partially with missing datasets or groups.
Causes
- File truncation during transfer or storage
- Corrupted file headers or superblocks
- Interrupted write operations
- Storage media failures
- Incompatible HDF5 library versions
- Filesystem corruption
- Network issues during remote access
Solutions
Solution 1: Verify File Integrity with h5check
Use the HDF5 validation tools to identify issues:
- Run h5check to validate the file structure:
h5check filename.h5
- Check for detailed error information about corruption location
- For more information, use h5dump with error detection:
h5dump -pH filename.h5
- Review error codes and specific problem areas (a quick Python signature check is sketched after this list)
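If the command-line tools are unavailable, a quick signature check from Python can tell you whether the file is still recognizable as HDF5 at all. A minimal sketch, assuming h5py is installed and using a placeholder file name:

import h5py

# False usually means the signature/superblock region is damaged or the file is
# not actually HDF5; True does not guarantee internal consistency, only that the
# signature was found.
print(h5py.is_hdf5('filename.h5'))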
Solution 2: Recover with h5repack or h5copy
Try to extract salvageable data:
- Use h5repack to create a clean copy of the file:
h5repack corrupted.h5 repaired.h5
- If h5repack fails, try selective extraction with h5copy:
h5copy -i corrupted.h5 -o extracted.h5 -s /path/to/dataset -d /path/to/dataset
- For partially accessible files, selectively copy individual datasets or groups that are still readable
Solution 3: Programmatic Recovery using High-Level Libraries
Use programming libraries with error handling:
- Python example using h5py with error handling:
import h5py
import numpy as np
import traceback

# Create a new file for recovered data
recovered = h5py.File('recovered.h5', 'w')

# Try to open the corrupted file with read-only and error handling
try:
    with h5py.File('corrupted.h5', 'r', swmr=True) as f:
        # Function to recursively visit and try to copy groups/datasets
        def visit_and_recover(name, obj):
            try:
                if isinstance(obj, h5py.Group):
                    # Create group in the recovered file if it doesn't exist
                    if name not in recovered:
                        recovered.create_group(name)
                        print(f"Successfully copied group: {name}")
                elif isinstance(obj, h5py.Dataset):
                    # Try to read and copy the dataset
                    try:
                        data = obj[()]
                        # Recreate dataset in the recovered file
                        if name not in recovered:
                            recovered.create_dataset(name, data=data)
                            print(f"Successfully copied dataset: {name}")
                    except Exception as e:
                        print(f"Failed to recover dataset {name}: {str(e)}")
            except Exception as e:
                print(f"Error processing {name}: {str(e)}")

        # Visit all objects in the file
        f.visititems(visit_and_recover)
except Exception as e:
    print(f"Failed to open file: {str(e)}")
    traceback.print_exc()
finally:
    # Always close the recovered file
    recovered.close()
    print("Recovery attempt completed.")
Solution 4: Use Low-Level HDF5 Recovery Tools
For more severe corruption, try specialized approaches:
- Check whether any of The HDF Group's recovery tools apply to your case (availability varies by release):
- h5recover for superblock damage
- h5repair for selective block recovery
- For corrupted metadata but intact raw data, consider byte-level extraction tools
- Commercial data recovery services specializing in scientific formats may be able to help with severe corruption
Solution 5: Preventive Replication for Critical HDF5 Files
To avoid future data loss, implement protective measures:
- Use h5repack periodically to clean and optimize important files:
h5repack -f GZIP=9 original.h5 optimized.h5
- Implement checksumming for datasets (a short h5py example of checksummed writes follows this list):
h5repack -f FLETCHER32 original.h5 checksummed.h5
- Consider storing critical data with redundancy using mirrored HDF5 files
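Checksums can also be enabled when data is first written, so that silent corruption is detected the next time a chunk is read. A minimal sketch, assuming h5py is installed; the file and dataset names are illustrative:

import h5py
import numpy as np

data = np.random.rand(1000, 1000)

with h5py.File('protected.h5', 'w') as f:
    # fletcher32=True stores a checksum per chunk, so reads fail loudly
    # if a chunk is damaged rather than returning bad values silently
    f.create_dataset('results', data=data, chunks=True, fletcher32=True,
                     compression='gzip', compression_opts=4)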
Error #2: "NetCDF Read Error" or "Invalid Dimensions"
Symptoms
When working with NetCDF files, you may encounter errors like "NetCDF: Invalid dimensions," "NetCDF: Not a valid file format," or "Error accessing variable." Parts of the file may be inaccessible, or dimensional information may be inconsistent.
Causes
- Incomplete file transfers
- File header corruption
- Version incompatibilities (NetCDF-3 vs. NetCDF-4)
- Dimension or variable name corruption
- Conflicts between dimensional definitions
- Incorrect attribute types or values
- Storage or networking issues during write operations
Solutions
Solution 1: NetCDF File Validation and Analysis
Analyze the file structure to identify issues:
- Use ncdump to examine the file structure:
ncdump -h filename.nc
- For more detailed checking, use validation utilities such as nccheck or nc-verify where your toolchain provides them
- Check for specific error information pointing to corrupted sections
- Verify version compatibility (a short netCDF4-python check follows this list):
ncdump -k filename.nc
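The same information can be pulled from Python when ncdump is not available. A minimal sketch, assuming the netCDF4 package is installed and using a placeholder file name:

import netCDF4 as nc

with nc.Dataset('filename.nc', 'r') as ds:
    print('On-disk format:', ds.data_model)   # e.g. NETCDF3_CLASSIC, NETCDF4, NETCDF4_CLASSIC
    for name, dim in ds.dimensions.items():
        size = 'unlimited' if dim.isunlimited() else len(dim)
        print(f'Dimension {name}: {size}')
    print('Variables:', list(ds.variables.keys()))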
Solution 2: Convert Between NetCDF Versions
Address version incompatibility issues:
- Convert NetCDF-4 to NetCDF-3:
ncks -3 input.nc output.nc
- Convert NetCDF-3 to NetCDF-4:
ncks -4 input.nc output.nc
- Try conversion with compression for optimized storage:
ncks -4 -L 4 input.nc compressed.nc
- For specific file format issues, try forcing a format type:
ncks --fl_fmt=netcdf4_classic input.nc output.nc
Solution 3: Extract Variables and Rebuild the File
Salvage individual components from the damaged file:
- Use NCO tools to extract variables selectively:
ncks -v variable_name input.nc extracted_var.nc
- Extract dimension information and attributes:
ncks -v .dimension_name input.nc extracted_dim.nc
- Merge salvaged components into a new file:
ncks -A extracted_var1.nc new.nc
ncks -A extracted_var2.nc new.nc
Solution 4: Programmatic NetCDF Repair with Python
Use the netCDF4 library for controlled file repair:
- Python example for selective recovery:
import netCDF4 as nc
import numpy as np

# Open a new file for recovered data
recovered = nc.Dataset('recovered.nc', 'w')

try:
    # Try to open the corrupted file in read-only mode
    with nc.Dataset('corrupted.nc', 'r') as src:
        # Copy dimensions
        for dim_name, dimension in src.dimensions.items():
            try:
                recovered.createDimension(
                    dim_name, len(dimension) if not dimension.isunlimited() else None)
                print(f"Copied dimension: {dim_name}")
            except Exception as e:
                print(f"Failed to copy dimension {dim_name}: {str(e)}")

        # Copy global attributes
        for attr_name in src.ncattrs():
            try:
                recovered.setncattr(attr_name, src.getncattr(attr_name))
                print(f"Copied global attribute: {attr_name}")
            except Exception as e:
                print(f"Failed to copy global attribute {attr_name}: {str(e)}")

        # Copy variables
        for var_name, variable in src.variables.items():
            try:
                # Create the variable in the new file
                var_type = variable.datatype
                var_dims = variable.dimensions
                var_out = recovered.createVariable(var_name, var_type, var_dims)

                # Copy variable attributes
                for attr_name in variable.ncattrs():
                    var_out.setncattr(attr_name, variable.getncattr(attr_name))

                # Copy the data
                var_out[:] = variable[:]
                print(f"Copied variable: {var_name}")
            except Exception as e:
                print(f"Failed to copy variable {var_name}: {str(e)}")
except Exception as e:
    print(f"Error opening corrupted file: {str(e)}")
finally:
    # Close the recovered file
    recovered.close()
    print("Recovery attempt completed.")
Solution 5: CDO and NCO Tools for Advanced Repair
Leverage climate data operators for recovery:
- Use CDO to fix common NetCDF issues:
cdo copy input.nc fixed.nc
- Repair time dimension issues:
cdo settaxis,yyyy-mm-dd,hh:mm:ss,timeunit input.nc fixed.nc
- Fix grid definition problems:
cdo setgrid,gridfile.txt input.nc fixed.nc
- Try selective data extraction and concatenation for corrupted timeseries:
cdo seldate,yyyy-mm-dd,yyyy-mm-dd input.nc part1.nc
cdo seldate,yyyy-mm-dd,yyyy-mm-dd input.nc part2.nc
cdo mergetime part1.nc part2.nc merged.nc
Error #3: "MATLAB File Format Error" or "Error Reading Variable from MAT-File"
Symptoms
When trying to load MATLAB (.mat) files, you may see error messages like "Invalid MAT-file," "Unable to read MAT-file header," or "Error reading variable from file." Variables may be missing, corrupted, or have incorrect types when loaded.
Causes
- Version incompatibilities (MATLAB 5.0 vs. 7.3 formats)
- Corrupted file headers
- Partial file saves due to crashes
- 64-bit vs. 32-bit data storage issues
- Platform-specific data format differences
- Compression errors in newer MAT formats
- Mixed version saves from different MATLAB versions
Solutions
Solution 1: Try Different MATLAB Loading Options
Adjust loading parameters to accommodate corruption:
- In MATLAB, use the 'load' command with options:
% MAT-file versions are detected automatically by load; try a plain load first,
% then force MAT-file interpretation if the extension or header is suspect
try
    data = load('corrupt.mat');
catch
    try
        data = load('corrupt.mat', '-mat');
    catch
        error('All loading attempts failed');
    end
end
- Try loading variables selectively to isolate corruption:
% List what variables are in the file
vars = who('-file', 'corrupt.mat');

% Try loading each variable separately
for i = 1:length(vars)
    try
        var_data = load('corrupt.mat', vars{i});
        fprintf('Successfully loaded: %s\n', vars{i});
    catch
        fprintf('Failed to load: %s\n', vars{i});
    end
end
Solution 2: Convert MAT File Versions
Transform between different MATLAB formats:
- Load and re-save in a different format:
% Load whatever can be loaded
try
    data = load('corrupt.mat');
    % Save in older format which might be more robust
    save('recovered_v6.mat', '-struct', 'data', '-v6');
    % Or save in newer format
    save('recovered_v7.mat', '-struct', 'data', '-v7');
catch e
    fprintf('Error during conversion: %s\n', e.message);
end
- For large files that might be using v7.3 (HDF5-based), try HDF5 tools:
% Use low-level HDF5 functions to access v7.3 format files
fileinfo = h5info('corrupt.mat');
datasets = {fileinfo.Datasets.Name};

% Extract datasets one by one
for i = 1:length(datasets)
    try
        data.(datasets{i}) = h5read('corrupt.mat', ['/' datasets{i}]);
        fprintf('Successfully extracted dataset: %s\n', datasets{i});
    catch
        fprintf('Failed to extract dataset: %s\n', datasets{i});
    end
end

% Save recovered data
save('recovered.mat', '-struct', 'data');
Solution 3: Use Third-Party Tools for MAT File Recovery
Leverage alternative libraries for loading MATLAB files:
- Python example using scipy.io:
import scipy.io as sio
import h5py
import numpy as np

# Try loading with scipy
try:
    data = sio.loadmat('corrupt.mat')
    print("Successfully loaded with scipy.io")
    # Save back to a new mat file
    sio.savemat('recovered_scipy.mat', data)
except Exception as e:
    print(f"scipy.io failed: {str(e)}")

    # Try HDF5 approach for v7.3 files
    try:
        with h5py.File('corrupt.mat', 'r') as f:
            # Create a dictionary to hold the data
            data = {}

            # Function to recursively visit all objects
            def visit_and_extract(name, obj):
                if isinstance(obj, h5py.Dataset):
                    try:
                        # Convert to numpy array
                        data[name] = np.array(obj)
                        print(f"Extracted: {name}")
                    except Exception as e:
                        print(f"Failed to extract {name}: {str(e)}")

            # Visit all objects
            f.visititems(visit_and_extract)

        # Save recovered data with scipy
        if data:
            sio.savemat('recovered_h5py.mat', data)
            print("Saved recovered data")
    except Exception as e:
        print(f"HDF5 approach failed: {str(e)}")
Solution 4: Binary Analysis for Header Repair
For advanced users, fix file headers manually:
- MATLAB MAT files have specific header structures depending on version:
- MAT 5.0 (Level 5) files begin with a 128-byte header
- The header opens with a descriptive text field that normally starts with 'MATLAB 5.0 MAT-file', followed by version and endian-indicator fields
- Use a hex editor to verify and potentially fix simple header corruption (a Python sketch for inspecting the header follows this list)
- For v7.3 files, use HDF5 header repair tools since they use HDF5 format
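The header can also be inspected directly from Python. A minimal standard-library sketch, assuming the Level 5 MAT-file layout described above and a placeholder file name:

import struct

with open('corrupt.mat', 'rb') as f:
    header = f.read(128)

# Bytes 1-116: descriptive text, normally starting with 'MATLAB 5.0 MAT-file'
text = header[:116].rstrip(b'\x00 ').decode('ascii', errors='replace')
# Bytes 125-126: version (0x0100); bytes 127-128: endian indicator ('IM' or 'MI')
version, = struct.unpack('<H', header[124:126])
endian = header[126:128]

print('Header text :', text)
print('Version     : 0x%04x' % version)
print('Endian flag :', endian)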
Solution 5: Partial Reconstruction from Research Results
When direct recovery fails, reconstruct critical data:
- Check for exported figures or data that might contain the essential information
- Look for script files that generated the data originally
- Check for derivative files or analysis results that might contain copies of variables
- If source data for calculations is available, rerun analyses to regenerate results
Error #4: "Jupyter Notebook Parse Error" or "Invalid Notebook Format"
Symptoms
When opening a Jupyter notebook (.ipynb file), you may encounter errors like "Notebook validation failed," "Invalid JSON," or "Unable to parse notebook." JupyterLab or Jupyter Notebook may fail to load the file, or display a corrupted version with missing cells or content.
Causes
- Corrupted JSON structure
- Interrupted save operations during kernel activity
- Notebook server crashes during autosave
- Merge conflicts in version control systems
- Manual edits to the notebook file
- JupyterLab/Notebook version incompatibilities
- Extremely large output cells causing parsing issues
Solutions
Solution 1: Jupyter Notebook Format Validation and Repair
Check and fix JSON structure issues:
- Rewrite the notebook with nbconvert, which runs nbformat validation as the file is read:
jupyter nbconvert --to notebook corrupted.ipynb --output validated.ipynb
- For more direct diagnostics, call nbformat's validator from Python:
python -c "import nbformat; nbformat.validate(nbformat.read('corrupted.ipynb', as_version=4)); print('Notebook is valid')"
- Try the notebook repair extension if available:
# If available
pip install nbrepair
jupyter nbrepair corrupted.ipynb
Solution 2: Fix JSON Structure Manually
Address specific JSON formatting issues:
- Open the .ipynb file in a text editor (it's just JSON)
- Look for obvious JSON errors:
- Missing or extra commas
- Unclosed brackets or braces
- Incomplete string values (missing quote marks)
- Use an online JSON validator, or the short Python check after this list, to identify specific syntax errors
- Focus on fixing structural issues rather than content initially
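A quick way to find where the JSON breaks is to let Python's json module report the exact position. A minimal standard-library sketch, using the same corrupted.ipynb name as the examples above:

import json

with open('corrupted.ipynb', 'r', encoding='utf-8') as f:
    text = f.read()

try:
    json.loads(text)
    print('JSON parses cleanly')
except json.JSONDecodeError as e:
    print(f'JSON error at line {e.lineno}, column {e.colno}: {e.msg}')
    # Show the surrounding region so the manual fix can be targeted
    start = max(e.pos - 80, 0)
    print(text[start:e.pos + 80])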
Solution 3: Extract Cells and Content Programmatically
Recover individual notebook components:
- Python script to extract salvageable cells:
import json
import nbformat

# Try to open the corrupted notebook
try:
    with open('corrupted.ipynb', 'r', encoding='utf-8') as f:
        content = f.read()

    # Try to parse the JSON, even if it's partially corrupted
    notebook_data = json.loads(content)

    # Extract cells
    cells = []
    if 'cells' in notebook_data:
        for i, cell in enumerate(notebook_data['cells']):
            try:
                # Validate each cell
                if 'cell_type' in cell and 'source' in cell:
                    cells.append(cell)
                    print(f"Successfully extracted cell {i}")
                else:
                    print(f"Skipping cell {i} due to missing required fields")
            except Exception as e:
                print(f"Error processing cell {i}: {str(e)}")

    # Create a new notebook with the salvageable cells
    new_notebook = nbformat.v4.new_notebook()
    new_notebook.cells = cells

    # If metadata is available, try to preserve it
    if 'metadata' in notebook_data:
        try:
            new_notebook.metadata = notebook_data['metadata']
        except Exception:
            print("Could not recover metadata")

    # Write the repaired notebook
    with open('recovered.ipynb', 'w', encoding='utf-8') as f:
        nbformat.write(new_notebook, f)

    print(f"Recovered {len(cells)} cells to recovered.ipynb")

except Exception as e:
    print(f"Failed to recover notebook: {str(e)}")

    # If JSON parsing completely fails, try to extract content with regex
    import re
    try:
        with open('corrupted.ipynb', 'r', encoding='utf-8') as f:
            content = f.read()

        # Extract code blocks
        code_blocks = re.findall(r'"source":\s*\[(.*?)\]', content, re.DOTALL)

        # Create a simple text file with extracted code
        with open('extracted_code.txt', 'w', encoding='utf-8') as f:
            for i, block in enumerate(code_blocks):
                f.write(f"--- BLOCK {i} ---\n")
                # Remove JSON formatting
                cleaned = re.sub(r'",\s*"', '\n', block)
                cleaned = re.sub(r'"', '', cleaned)
                # Unescape newlines
                cleaned = cleaned.replace('\\n', '\n')
                f.write(cleaned)
                f.write('\n\n')

        print(f"Extracted {len(code_blocks)} code blocks to extracted_code.txt")
    except Exception as e2:
        print(f"Even basic content extraction failed: {str(e2)}")
Solution 4: Recover from Jupyter Autosave or Checkpoints
Look for automatic backups created by Jupyter:
- Check for checkpoint files in the .ipynb_checkpoints directory:
ls -la .ipynb_checkpoints/
- Restore from the checkpoint version:
cp .ipynb_checkpoints/notebook_name-checkpoint.ipynb recovered.ipynb
- For JupyterLab, look for autosave files with names like:
ls -la ~/.jupyter/lab/workspaces/
Solution 5: Convert to Other Formats and Rebuild
Try conversion to simpler formats:
- If the notebook partially opens, export to a different format:
jupyter nbconvert --to python corrupted.ipynb
- For markdown content:
jupyter nbconvert --to markdown corrupted.ipynb
- Create a new notebook and copy salvageable content from these exports
- If output data is critical, try extracting just the HTML:
jupyter nbconvert --to html corrupted.ipynb
Error #5: "NumPy Array Loading Error" or "NPY Format Issue"
Symptoms
When trying to load NumPy binary files (.npy, .npz), you may encounter errors like "Unable to read array header," "Invalid NPY format," or "Cannot load NPZ file." The data may fail to load entirely, or load with incorrect shapes or data types.
Causes
- Corrupted file headers
- Incompatible NumPy versions
- Endianness issues across different platforms
- Incomplete file writes
- Mixed data type corruption
- Compression errors in NPZ files
Solutions
Solution 1: NumPy Loading with Error Handling
Try different loading approaches:
- Python code with flexible loading options:
import numpy as np

def try_load_npy(filename):
    # Try different approaches to load a potentially corrupted NPY file
    try:
        # Standard approach
        data = np.load(filename)
        print("Standard loading successful")
        return data
    except Exception as e1:
        print(f"Standard loading failed: {str(e1)}")

    try:
        # Try with allow_pickle
        data = np.load(filename, allow_pickle=True)
        print("Loading with allow_pickle successful")
        return data
    except Exception as e2:
        print(f"allow_pickle loading failed: {str(e2)}")

    try:
        # Try with fix_imports
        data = np.load(filename, allow_pickle=True, fix_imports=True)
        print("Loading with fix_imports successful")
        return data
    except Exception as e3:
        print(f"fix_imports loading failed: {str(e3)}")

    try:
        # Try with mmap_mode for large files
        data = np.load(filename, mmap_mode='r')
        print("Loading with mmap_mode successful")
        return data
    except Exception as e4:
        print(f"mmap_mode loading failed: {str(e4)}")

    # All attempts failed
    print("All loading attempts failed")
    return None

# For NPZ files
def try_load_npz(filename):
    try:
        # Standard approach
        data = np.load(filename)
        print("NPZ loading successful")
        print(f"Available arrays: {list(data.keys())}")
        return data
    except Exception as e:
        print(f"NPZ loading failed: {str(e)}")

    # Try opening as a zip file
    try:
        import zipfile
        import io

        with zipfile.ZipFile(filename) as z:
            print(f"NPZ file contains: {z.namelist()}")

            # Extract individual arrays
            arrays = {}
            for name in z.namelist():
                if name.endswith('.npy'):
                    try:
                        with z.open(name) as f:
                            # Read the file into a BytesIO object
                            data_bytes = io.BytesIO(f.read())
                            # Try to load the array
                            arr = np.load(data_bytes)
                            arrays[name[:-4]] = arr  # Remove .npy extension
                            print(f"Successfully extracted array: {name}")
                    except Exception as e2:
                        print(f"Failed to extract {name}: {str(e2)}")
            return arrays
    except Exception as e3:
        print(f"Zip extraction failed: {str(e3)}")
        return None
Solution 2: Repair NumPy File Headers
Fix header information in corrupted files:
- Understanding the NPY format:
- NPY files start with a magic string ('\x93NUMPY')
- Followed by two version bytes, a header-length field, and a header dictionary describing the dtype, shape, and memory order
- Create a script to fix common header issues:
import numpy as np
import struct

def repair_npy_header(corrupted_file, repaired_file, expected_shape, dtype):
    """
    Attempt to repair a corrupted NPY file by reconstructing its header

    Parameters:
    corrupted_file - Path to the corrupted NPY file
    repaired_file  - Where to save the repaired file
    expected_shape - Tuple with the expected array shape
    dtype          - Expected data type (e.g., 'float32', 'int64')
    """
    try:
        # Read the raw data from the corrupted file
        with open(corrupted_file, 'rb') as f:
            content = f.read()

        # Check if the magic string is present
        if not content.startswith(b'\x93NUMPY'):
            print("Magic string missing, adding NPY header")

            # Describe the expected array
            dtype_obj = np.dtype(dtype)
            header = {
                'descr': dtype_obj.str,
                'fortran_order': False,
                'shape': expected_shape,
            }

            # Convert header to its string representation
            header_str = repr(header).replace("'", '"')
            header_bytes = header_str.encode('utf-8')

            # Pad with spaces so the total header length is a multiple of 64,
            # as the NPY format expects, and terminate with a newline
            prefix_len = 6 + 2 + 2  # magic + version + header-length field
            padding = 64 - ((prefix_len + len(header_bytes) + 1) % 64)
            header_bytes = header_bytes + b' ' * padding + b'\n'

            # Format: 6-byte magic string + 2-byte version + 2-byte header length + header
            magic = b'\x93NUMPY'
            version = struct.pack('BB', 1, 0)
            header_len = struct.pack('<H', len(header_bytes))

            # Write the reconstructed header followed by the original bytes,
            # which are assumed to be the raw array data
            with open(repaired_file, 'wb') as out:
                out.write(magic + version + header_len + header_bytes + content)
        else:
            print("Magic string present, copying file unchanged")
            with open(repaired_file, 'wb') as out:
                out.write(content)

        # Verify that the repaired file now loads
        array = np.load(repaired_file)
        print(f"Repaired file loads with shape {array.shape} and dtype {array.dtype}")
        return True
    except Exception as e:
        print(f"Repair failed: {str(e)}")
        return False
Solution 3: Extract Raw Data and Reconstruct
For severe corruption, extract the raw binary data:
- Skip the header and try to recover the raw data:
import numpy as np
import os

def extract_raw_data(corrupted_file, output_file, expected_shape, dtype):
    """
    Extract raw data from a corrupted NPY file, skipping the header
    """
    # Determine the data size
    dtype_obj = np.dtype(dtype)
    element_size = dtype_obj.itemsize
    total_elements = np.prod(expected_shape)
    expected_data_size = total_elements * element_size

    # Get file size
    file_size = os.path.getsize(corrupted_file)

    # Read the file
    with open(corrupted_file, 'rb') as f:
        # Skip potential header (NPY header is typically less than 128 bytes)
        header_size = min(128, file_size - expected_data_size)
        if header_size < 0:
            print("File too small for expected data size")
            return False

        f.seek(header_size)
        raw_data = f.read(expected_data_size)

    # Reshape the raw data into the expected array
    try:
        array = np.frombuffer(raw_data, dtype=dtype_obj)
        if len(array) == total_elements:
            array = array.reshape(expected_shape)
            # Save the reconstructed array
            np.save(output_file, array)
            print(f"Raw data extracted and saved to {output_file}")
            return True
        else:
            print(f"Extracted data size mismatch: got {len(array)}, expected {total_elements}")
            return False
    except Exception as e:
        print(f"Failed to reconstruct array: {str(e)}")
        return False
Solution 4: NPZ Archive Recovery
For NPZ files (which are ZIP archives), use ZIP recovery:
- Use ZIP utilities to check and extract contents:
import zipfile
import numpy as np
import io

def recover_npz(corrupted_npz, output_dir):
    """
    Try to recover individual NPY files from a corrupted NPZ archive
    """
    try:
        # Try to open as a ZIP file
        with zipfile.ZipFile(corrupted_npz, 'r') as z:
            file_list = z.namelist()
            print(f"NPZ archive contains: {file_list}")

            success_count = 0
            for name in file_list:
                if name.endswith('.npy'):
                    try:
                        # Extract the file
                        z.extract(name, output_dir)
                        print(f"Extracted {name} to {output_dir}")

                        # Try to load it
                        arr = np.load(f"{output_dir}/{name}")
                        print(f"Successfully loaded {name}, shape: {arr.shape}, dtype: {arr.dtype}")
                        success_count += 1
                    except Exception as e:
                        print(f"Failed to process {name}: {str(e)}")

            print(f"Recovered {success_count} of {len(file_list)} files")
            return success_count > 0

    except zipfile.BadZipFile:
        print("File is not a valid ZIP/NPZ archive")

        # For severely corrupted ZIP files, try ZIP repair tools or raw extraction
        try:
            # Simple example - in practice, use specialized ZIP repair tools
            with open(corrupted_npz, 'rb') as f:
                data = f.read()

            # Look for NPY file signatures within the data
            npy_sigs = [b'\x93NUMPY']
            positions = []
            for sig in npy_sigs:
                pos = 0
                while True:
                    pos = data.find(sig, pos)
                    if pos == -1:
                        break
                    positions.append(pos)
                    pos += 1

            if positions:
                print(f"Found {len(positions)} potential NPY headers in the corrupted file")

                # Try to extract data starting from these positions
                for i, pos in enumerate(positions):
                    try:
                        # Extract a chunk of data (arbitrary size)
                        chunk = data[pos:pos + 10000000]  # 10MB chunk

                        # Write the chunk and test whether it loads as NPY
                        with open(f"{output_dir}/recovered_{i}.npy", 'wb') as out:
                            out.write(chunk)
                        try:
                            arr = np.load(f"{output_dir}/recovered_{i}.npy")
                            print(f"Successfully recovered array {i}, shape: {arr.shape}")
                        except Exception:
                            print(f"Extracted chunk {i} is not a valid NPY file")
                    except Exception as e:
                        print(f"Failed to extract chunk {i}: {str(e)}")
                return True
            else:
                print("No NPY signatures found in the file")
                return False
        except Exception as e:
            print(f"Raw extraction failed: {str(e)}")
            return False
Solution 5: Alternative Storage Format Conversion
When dealing with problematic NumPy binary files, convert to more robust formats:
- If you can load the data, save in multiple formats for redundancy:
import numpy as np
import h5py
import pickle

def save_array_multi_format(array, base_filename):
    """
    Save an array in multiple formats for redundancy
    """
    # NumPy binary
    np.save(f"{base_filename}.npy", array)

    # Compressed NumPy
    np.savez_compressed(f"{base_filename}.npz", array=array)

    # HDF5 format
    with h5py.File(f"{base_filename}.h5", 'w') as f:
        f.create_dataset('array', data=array)

    # CSV (for 2D arrays)
    if array.ndim <= 2:
        np.savetxt(f"{base_filename}.csv", array, delimiter=',')

    # Python pickle
    with open(f"{base_filename}.pkl", 'wb') as f:
        pickle.dump(array, f)

    print(f"Saved array in multiple formats with base name: {base_filename}")
Error #6: "Parquet/Arrow File Corruption" or "Columnar Data Access Issues"
Symptoms
When working with modern columnar storage formats like Parquet or Arrow, you may encounter errors like "Invalid Parquet file," "Footer corruption," or "Arrow metadata error." Only partial data may be accessible, or specific columns might be unreadable.
Causes
- File truncation during write operations
- Corrupted file metadata or footers
- Incompatible format versions
- Compression-related errors
- Schema inconsistencies or type violations
- Library version incompatibilities
Solutions
Solution 1: Parquet Validation and Inspection
Analyze the file structure to identify issues:
- Use parquet-tools to examine the file:
parquet-tools meta corrupted.parquet
parquet-tools schema corrupted.parquet
- For detailed inspection:
parquet-tools dump corrupted.parquet
- Check for specific metadata or row group issues:
parquet-tools inspect corrupted.parquet
Solution 2: Selective Column and Row Group Reading
Extract accessible portions of the data:
- Python example using pyarrow:
import pyarrow.parquet as pq
import pandas as pd

def recover_parquet_by_columns(corrupted_file, output_file):
    """
    Attempt to recover a Parquet file by reading columns selectively
    """
    try:
        # Try to read the file metadata
        try:
            parquet_file = pq.ParquetFile(corrupted_file)
            schema = parquet_file.schema
            print(f"Successfully read schema with {len(schema.names)} columns")
            column_names = schema.names
        except Exception as e:
            print(f"Failed to read schema: {str(e)}")
            # Try a different approach to get column names
            try:
                # Read only the footer schema, without touching the data pages
                schema = pq.read_schema(corrupted_file)
                column_names = schema.names
                print(f"Retrieved {len(column_names)} column names from the footer schema")
            except Exception:
                print("Cannot determine column names, recovery not possible")
                return False

        # Try reading each column individually
        recovered_columns = {}
        for col in column_names:
            try:
                # Read just this column
                column_data = pd.read_parquet(corrupted_file, columns=[col])
                recovered_columns[col] = column_data[col]
                print(f"Successfully recovered column: {col}")
            except Exception as e:
                print(f"Failed to recover column {col}: {str(e)}")

        # Combine recovered columns into a DataFrame
        if recovered_columns:
            recovered_df = pd.DataFrame(recovered_columns)
            print(f"Recovered DataFrame with {len(recovered_df)} rows "
                  f"and {len(recovered_columns)} columns")

            # Save the recovered data
            recovered_df.to_parquet(output_file)
            print(f"Saved recovered data to {output_file}")
            return True
        else:
            print("No columns could be recovered")
            return False
    except Exception as e:
        print(f"Overall recovery failed: {str(e)}")
        return False

def recover_parquet_by_row_groups(corrupted_file, output_file):
    """
    Attempt to recover a Parquet file by reading row groups selectively
    """
    try:
        # Try to open the file and get row group info
        parquet_file = pq.ParquetFile(corrupted_file)
        num_row_groups = parquet_file.num_row_groups
        print(f"File has {num_row_groups} row groups")

        # Try to read each row group
        dfs = []
        for i in range(num_row_groups):
            try:
                row_group = parquet_file.read_row_group(i)
                df = row_group.to_pandas()
                dfs.append(df)
                print(f"Successfully read row group {i} with {len(df)} rows")
            except Exception as e:
                print(f"Failed to read row group {i}: {str(e)}")

        # Combine the recovered row groups
        if dfs:
            recovered_df = pd.concat(dfs, ignore_index=True)
            print(f"Recovered DataFrame with {len(recovered_df)} rows "
                  f"and {len(recovered_df.columns)} columns")

            # Save the recovered data
            recovered_df.to_parquet(output_file)
            print(f"Saved recovered data to {output_file}")
            return True
        else:
            print("No row groups could be recovered")
            return False
    except Exception as e:
        print(f"Overall recovery failed: {str(e)}")
        return False
Solution 3: Format Conversion Recovery
Convert between formats to bypass corruption:
- Try different libraries and formats:
import pyarrow.parquet as pq
import pyarrow as pa
import pandas as pd

def multi_format_recovery(corrupted_file, base_output):
    """
    Try to recover data using multiple format conversions
    """
    recovery_methods = []

    # Method 1: PyArrow direct
    try:
        table = pq.read_table(corrupted_file)
        pq.write_table(table, f"{base_output}_pyarrow.parquet")
        recovery_methods.append("pyarrow_direct")
        print("PyArrow direct recovery successful")
    except Exception as e:
        print(f"PyArrow direct failed: {str(e)}")

    # Method 2: Via pandas
    try:
        df = pd.read_parquet(corrupted_file)
        df.to_parquet(f"{base_output}_pandas.parquet")
        recovery_methods.append("pandas_parquet")
        print("Pandas parquet recovery successful")
    except Exception as e:
        print(f"Pandas parquet failed: {str(e)}")

    # Method 3: Parquet to CSV to Parquet
    try:
        df = pd.read_parquet(corrupted_file)
        csv_path = f"{base_output}.csv"
        df.to_csv(csv_path, index=False)
        print(f"Saved to CSV: {csv_path}")

        # Read back from CSV
        df_csv = pd.read_csv(csv_path)
        df_csv.to_parquet(f"{base_output}_via_csv.parquet")
        recovery_methods.append("via_csv")
        print("CSV roundtrip recovery successful")
    except Exception as e:
        print(f"CSV roundtrip failed: {str(e)}")

    # Method 4: Convert to Arrow IPC format
    try:
        table = pq.read_table(corrupted_file)
        arrow_path = f"{base_output}.arrow"
        with pa.OSFile(arrow_path, 'wb') as sink:
            with pa.RecordBatchFileWriter(sink, table.schema) as writer:
                writer.write_table(table)

        # Read back from Arrow
        with pa.memory_map(arrow_path, 'rb') as source:
            reader = pa.RecordBatchFileReader(source)
            arrow_table = reader.read_all()
        pq.write_table(arrow_table, f"{base_output}_via_arrow.parquet")
        recovery_methods.append("via_arrow")
        print("Arrow IPC roundtrip successful")
    except Exception as e:
        print(f"Arrow IPC roundtrip failed: {str(e)}")

    # Summary
    if recovery_methods:
        print(f"Successfully recovered data using: {', '.join(recovery_methods)}")
        return True
    else:
        print("All recovery methods failed")
        return False
Solution 4: Repair Parquet Footer and Metadata
For advanced users, fix file structure issues:
- Understanding Parquet structure:
- Parquet files have a footer with metadata at the end
- The last 8 bytes are a 4-byte footer length followed by the 4-byte 'PAR1' magic number
- Corrupted footers often cause most recovery issues
- Python example to fix truncated files (advanced):
import struct
import os
import pyarrow.parquet as pq

def repair_truncated_parquet(corrupted_file, fixed_file):
    """
    Attempt to repair a truncated Parquet file by reconstructing the footer
    Note: This is a simplified example and may not work for all cases
    """
    try:
        # First, make a copy of the corrupted file
        with open(corrupted_file, 'rb') as f_in, open(fixed_file, 'wb') as f_out:
            f_out.write(f_in.read())

        # Try to extract schema information from a similar file or first part of the file
        try:
            # This assumes that part of the file is valid and schema can be read
            partial_schema = pq.read_schema(corrupted_file)
            print(f"Retrieved partial schema with {len(partial_schema.names)} columns")

            # In a real implementation, you would now:
            # 1. Reconstruct proper row group metadata
            # 2. Recalculate column chunk offsets and sizes
            # 3. Build a new file footer with proper statistics
            # 4. Write the footer to the end of the file
            # 5. Append the footer length (4 bytes) and the 'PAR1' magic (4 bytes)
            print("Full footer reconstruction requires detailed Parquet format knowledge")
            print("Consider using specialized Parquet repair tools for serious corruption")
            return True
        except Exception as e:
            print(f"Schema extraction failed: {str(e)}")
            return False
    except Exception as e:
        print(f"Repair attempt failed: {str(e)}")
        return False
Solution 5: Use Specialized Arrow/Parquet Tools
Leverage dedicated utilities for recovery:
- For Arrow IPC files, use a validation utility such as arrow-validate if your Arrow installation provides one, or the pyarrow check sketched after this list:
arrow-validate corrupted.arrow
- Consider commercial or specialized data recovery tools designed for columnar formats
- Search for recovery utilities in the Apache Arrow and Parquet community resources
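If no standalone validator is available, pyarrow's built-in validation can be run directly. A minimal sketch, assuming a reasonably recent pyarrow and an Arrow IPC file with a placeholder name:

import pyarrow as pa
import pyarrow.ipc as ipc

with pa.memory_map('corrupted.arrow', 'rb') as source:
    table = ipc.open_file(source).read_all()

# Raises ArrowInvalid if buffer sizes, offsets, or null counts are inconsistent
table.validate(full=True)
print('Arrow file passed full validation')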
Error #7: "Domain-Specific Format Errors" (FITS, PDB, etc.)
Symptoms
When working with specialized scientific formats like FITS (astronomy), PDB (molecular structures), or other domain-specific formats, you may encounter errors like "Invalid header," "Structure validation failed," or "Cannot parse format." The files may fail to load in specialized software, or display incorrectly.
Causes
- Format-specific structural corruption
- Incompatible format versions or extensions
- Missing required metadata or fields
- Software version incompatibilities
- File transfer or encoding issues
- Domain-specific constraints violations
Solutions
Solution 1: FITS File Recovery (Astronomy)
For corrupted FITS files used in astronomy:
- Use FITS utilities to examine and fix the file:
# Check the file structure
fitsinfo corrupted.fits

# Verify the header
fitsdump -h corrupted.fits

# Try to fix common issues
fitsverify -e corrupted.fits
- Python example using astropy:
from astropy.io import fits
import numpy as np

def recover_fits(corrupted_file, output_file):
    """
    Attempt to recover data from a corrupted FITS file
    """
    try:
        # Try opening with various options
        try:
            hdul = fits.open(corrupted_file, ignore_missing_end=True)
            print("Successfully opened FITS file with ignore_missing_end")
        except Exception as e1:
            print(f"Standard open failed: {str(e1)}")
            try:
                hdul = fits.open(corrupted_file, ignore_missing_end=True, checksum=False)
                print("Successfully opened FITS file with checksum disabled")
            except Exception as e2:
                print(f"Checksum disabled open failed: {str(e2)}")
                return False

        # Process each HDU (Header Data Unit)
        salvaged_hdus = []
        for i, hdu in enumerate(hdul):
            try:
                # Check if header is readable
                header = hdu.header
                print(f"HDU {i} has readable header with {len(header)} keywords")

                # Check if data is accessible
                try:
                    data = hdu.data
                    if data is not None:
                        print(f"HDU {i} has data with shape {data.shape} and type {data.dtype}")
                        # Create a new HDU with the salvaged data
                        if isinstance(hdu, fits.PrimaryHDU):
                            new_hdu = fits.PrimaryHDU(data=data, header=header)
                        else:
                            new_hdu = fits.ImageHDU(data=data, header=header)
                        salvaged_hdus.append(new_hdu)
                    else:
                        print(f"HDU {i} has no data")
                        salvaged_hdus.append(fits.ImageHDU(header=header))
                except Exception as e:
                    print(f"Could not access data in HDU {i}: {str(e)}")
                    # Try to salvage just the header
                    salvaged_hdus.append(fits.ImageHDU(header=header))
            except Exception as e:
                print(f"Could not process HDU {i}: {str(e)}")

        # Create a new FITS file with salvaged HDUs
        if salvaged_hdus:
            new_hdul = fits.HDUList(salvaged_hdus)
            new_hdul.writeto(output_file, overwrite=True)
            print(f"Wrote {len(salvaged_hdus)} HDUs to {output_file}")
            return True
        else:
            print("No HDUs could be salvaged")
            return False
    except Exception as e:
        print(f"Overall recovery failed: {str(e)}")
        return False
Solution 2: PDB File Repair (Molecular Structures)
For protein and molecular structure files:
- Use structure validation tools:
pdb_validate corrupted.pdb
- Python example using Biopython:
from Bio import PDB
import re

def repair_pdb(corrupted_file, output_file):
    """
    Attempt to repair a corrupted PDB file
    """
    try:
        # Try using the PDB parser in permissive mode
        parser = PDB.PDBParser(QUIET=True, PERMISSIVE=True)
        try:
            structure = parser.get_structure('structure', corrupted_file)
            print("Successfully parsed PDB with permissive parser")

            # If successful, write to a new file
            io = PDB.PDBIO()
            io.set_structure(structure)
            io.save(output_file)
            print(f"Saved repaired structure to {output_file}")
            return True
        except Exception as e:
            print(f"Permissive parsing failed: {str(e)}")

        # If parsing fails completely, try line-by-line repair
        with open(corrupted_file, 'r') as f:
            lines = f.readlines()

        # Filter for valid ATOM/HETATM records
        valid_lines = []
        atom_pattern = re.compile(
            r'^(ATOM|HETATM)(\s*\d+\s+\w+\s+\w+\s+\w+\s+\d+\s+[-\d\.]+\s+[-\d\.]+\s+[-\d\.]+).*$')

        for line in lines:
            if line.startswith(('ATOM', 'HETATM')):
                match = atom_pattern.match(line)
                if match:
                    # This is a valid-looking ATOM/HETATM record
                    valid_lines.append(line)
            elif line.startswith(('TER', 'END', 'HEADER', 'TITLE', 'REMARK')):
                # Keep these administrative records
                valid_lines.append(line)

        if valid_lines:
            # Ensure we have an END record
            if not any(line.startswith('END') for line in valid_lines):
                valid_lines.append('END\n')

            # Write the cleaned file
            with open(output_file, 'w') as f:
                f.writelines(valid_lines)
            print(f"Wrote {len(valid_lines)} valid records to {output_file}")

            # Try parsing again
            try:
                structure = parser.get_structure('fixed', output_file)
                print("Successfully parsed the repaired PDB file")
                return True
            except Exception as e:
                print(f"Parsing of repaired file still failed: {str(e)}")
                return False
        else:
            print("No valid ATOM/HETATM records found")
            return False
    except Exception as e:
        print(f"Overall repair attempt failed: {str(e)}")
        return False
Solution 3: General Approach for Domain-Specific Formats
Apply these general principles to any specialized format:
- Understand the file structure:
- Study the format specification if available
- Identify critical header/metadata sections vs. data sections
- Learn what validation constraints apply to the format
- Use domain-specific validation tools:
- Most scientific domains have format-specific validators
- Run with permissive options when available
- Create a minimal valid file:
- Study examples of minimal valid files in the format
- Compare headers and structures with your corrupted file (the signature check sketched after this list can confirm what format you actually have)
- Sometimes combining a valid header with your data can work
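When it is unclear what a damaged file actually is, a simple check of its leading bytes against well-known format signatures helps before reaching for format-specific tools. A minimal standard-library sketch with a few common signatures; the file name is a placeholder:

SIGNATURES = {
    b'\x89HDF\r\n\x1a\n': 'HDF5',
    b'CDF\x01': 'NetCDF classic',
    b'CDF\x02': 'NetCDF 64-bit offset',
    b'SIMPLE  =': 'FITS',
    b'PAR1': 'Parquet',
    b'\x93NUMPY': 'NumPy NPY',
    b'PK\x03\x04': 'ZIP container (NPZ and other archive-based formats)',
}

with open('mystery_file.dat', 'rb') as f:
    head = f.read(16)

matches = [name for magic, name in SIGNATURES.items() if head.startswith(magic)]
print('Detected format:', matches[0] if matches else 'unknown (check the format specification)')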
Solution 4: Format Conversion Recovery
Use alternative formats when direct repair fails:
- Identify common interchange formats in your scientific domain
- If partial reading works, export to a simpler format:
- For structural data: Convert to simpler formats like mmCIF or SDF
- For image data: Export to TIFF or other standard formats
- For tabular data: Export to CSV or TSV
- If raw data is crucial, extract the binary data blocks and rebuild
Solution 5: Consult Domain Experts
Seek specialized help for critical files:
- Scientific domains often have mailing lists or forums for format issues
- Contact the original software developers for recovery guidance
- Consider professional data recovery services that specialize in scientific data
Preventative Measures for Scientific Computing File Errors
Taking proactive steps can significantly reduce the risk of scientific data file issues:
- Regular File Validation: Use format-specific validation tools routinely
- Multiple Format Storage: Save critical results in multiple file formats
- Versioned Backups: Implement systematic backup procedures with versioning
- Checksumming: Calculate and store file checksums with your data
- Use Robust Storage Formats: Prefer formats with built-in validation (HDF5 with checksums, etc.)
- Atomic File Operations: Use temporary files and atomic renames for safer saves (see the sketch after this list)
- Metadata Documentation: Document data formats and structures separately
- Version Control: Use Git LFS or similar for tracking data files
- Automated Testing: Implement automated validation in data processing pipelines
- Software Updates: Keep scientific libraries and tools current
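For the atomic-save and checksumming items above, the basic pattern is small enough to show directly. A minimal standard-library sketch with a hypothetical helper and illustrative file names:

import hashlib
import os

def atomic_save(path, payload: bytes):
    """Write bytes to a temporary file, then atomically rename into place."""
    tmp = path + '.tmp'
    with open(tmp, 'wb') as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())      # ensure the bytes reach the disk before renaming
    os.replace(tmp, path)         # atomic replacement on both POSIX and Windows
    # Record a checksum next to the data so later corruption can be detected
    digest = hashlib.sha256(payload).hexdigest()
    with open(path + '.sha256', 'w') as f:
        f.write(f'{digest}  {os.path.basename(path)}\n')
    return digest

digest = atomic_save('results.bin', b'simulation output goes here')
print('Stored checksum:', digest)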
Best Practices for Scientific Data File Management
Follow these best practices to minimize problems with scientific computing files:
- Format Selection: Choose appropriate formats based on data characteristics and needs
- Version Control Integration: Use Git LFS or DVC for large scientific datasets
- Standardized Naming: Implement consistent file naming with version indicators
- Metadata Management: Include comprehensive metadata within files
- Data Publication Preparation: Validate files before submission to repositories
- Documentation: Document data structures and dependencies
- Format Conversion Testing: Verify round-trip conversions preserve data integrity (see the sketch after this list)
- Dependency Management: Track software dependencies that affect file formats
- Storage Media Selection: Use appropriate storage for different data lifecycle stages
- Recovery Planning: Develop and test data recovery procedures in advance
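For the format-conversion testing item, a minimal round-trip check, assuming numpy and h5py and using illustrative file names:

import numpy as np
import h5py

original = np.load('results.npy')

# Convert to HDF5 and back, then confirm nothing changed
with h5py.File('results_copy.h5', 'w') as f:
    f.create_dataset('results', data=original)

with h5py.File('results_copy.h5', 'r') as f:
    roundtrip = f['results'][()]

assert roundtrip.dtype == original.dtype and np.array_equal(original, roundtrip), \
    'Round-trip conversion altered the data'
print('Round-trip conversion preserved the data exactly')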
Scientific Computing File Repair Software and Tools
Several specialized tools can help troubleshoot and repair scientific data files:
- Format-Specific Tools:
- h5check, h5repack, h5dump (HDF5)
- nccopy, nccheck, ncdump (NetCDF)
- fitsverify, fitsfix (FITS)
- pdb_validate, pdb_repair (PDB)
- parquet-tools (Parquet)
- Programming Libraries:
- h5py, PyTables (Python for HDF5)
- netCDF4-python (Python for NetCDF)
- astropy (Python for FITS)
- Biopython, PyMOL (Molecular structures)
- pyarrow (Arrow/Parquet)
- General Data Analysis Tools:
- Pandas (Python data analysis)
- NumPy (Array operations)
- Jupyter Notebooks (Interactive analysis)
- Domain-Specific Software:
- DS9, CASA (Astronomy)
- VMD, PyMOL (Molecular visualization)
- Climate Data Operators (CDO) (Climate science)
- Low-Level Inspection Tools:
- hexdump, xxd (Hex editors)
- strings (Text extraction)
- file (File type identification)
Having appropriate tools for your specific scientific domain is essential for effective troubleshooting and recovery.
Advanced Considerations for High-Performance Computing Data
For scientific data used in high-performance computing environments, consider these additional factors:
Parallel File Access and Corruption
- Parallel file systems like Lustre or GPFS introduce additional complexity
- File striping across multiple storage targets can complicate recovery
- Use parallel-aware tools and libraries (Parallel HDF5, Parallel NetCDF)
- Implement proper locking mechanisms for concurrent access
Big Data Considerations
- For extremely large datasets (TB+), standard tools may be insufficient
- Consider specialized big data repair approaches using distributed computing
- Implement chunking strategies for manageable error isolation (see the sketch after this list)
- Build redundancy into data storage from the beginning
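For the chunking item above, a minimal h5py sketch (illustrative file name and sizes) of writing a large dataset in checksummed chunks so that damage stays localized:

import h5py
import numpy as np

with h5py.File('large_simulation.h5', 'w') as f:
    dset = f.create_dataset('field', shape=(10000, 1024), dtype='float32',
                            chunks=(1000, 1024), fletcher32=True,
                            compression='gzip')
    for start in range(0, 10000, 1000):
        # Each chunk is compressed and checksummed independently,
        # so a damaged chunk only affects this 1000-row slice
        dset[start:start + 1000, :] = np.random.rand(1000, 1024).astype('float32')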
Long-term Data Preservation
- Scientific data often needs to remain accessible for decades
- Consider format obsolescence in long-term archiving strategies
- Document recovery procedures with the archived data
- Include sample code for reading/interpreting the data
- Store multiple representation formats when possible
Conclusion
Scientific computing file errors present unique challenges due to the specialized formats, complex data structures, and high value of research data. Whether dealing with HDF5 corruption, NetCDF dimension issues, or domain-specific format problems, a methodical approach to troubleshooting and recovery is essential to preserve valuable scientific information.
Prevention is the most effective strategy, and implementing good scientific data management practices—including format selection, validation, backup procedures, and documentation—can significantly reduce the likelihood of encountering serious file issues. When problems do arise, approach them systematically, starting with format-specific validation and using the appropriate specialized tools for your scientific domain.
By following the guidance in this article and utilizing appropriate tools, researchers and data scientists should be well-equipped to handle most scientific computing file errors they may encounter, ensuring that valuable research data remains accessible and usable for analysis and reproducibility.