Answer A: Anthropic Claude Haiku 4.5
#!/usr/bin/env python3
import argparse
import os
import shutil
import hashlib
from datetime import datetime
from pathlib import Path


def get_file_md5(file_path):
    """Calculate MD5 hash of a file."""
    hash_md5 = hashlib.md5()
    try:
        with open(file_path, 'rb') as f:
            for chunk in iter(lambda: f.read(4096), b''):
                hash_md5.update(chunk)
        return hash_md5.hexdigest()
    except Exception as e:
        log(f"ERROR: Failed to calculate MD5 for {file_path}: {e}")
        return None


def log(message):
    """Log message to console and log file."""
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    log_message = f"[{timestamp}] {message}"
    print(log_message)
    if log_file_path:
        try:
            with open(log_file_path, 'a') as f:
                f.write(log_message + '\n')
        except Exception as e:
            print(f"ERROR: Failed to write to log file: {e}")


def sync_directories(source, replica):
    """Perform one-way synchronization from source to replica."""
    source_path = Path(source)
    replica_path = Path(replica)

    if not source_path.exists():
        log(f"ERROR: Source path does not exist: {source}")
        return

    if not replica_path.exists():
        log(f"Creating replica directory: {replica}")
        replica_path.mkdir(parents=True, exist_ok=True)

    # Get all items in source and replica
    source_items = {}
    replica_items = {}

    # Build source items dictionary
    for root, dirs, files in os.walk(source_path):
        for file in files:
            file_path = Path(root) / file
            rel_path = file_path.relative_to(source_path)
            source_items[str(rel_path)] = ('file', file_path)
        for dir_name in dirs:
            dir_path = Path(root) / dir_name
            rel_path = dir_path.relative_to(source_path)
            source_items[str(rel_path)] = ('dir', dir_path)

    # Build replica items dictionary
    for root, dirs, files in os.walk(replica_path):
        for file in files:
            file_path = Path(root) / file
            rel_path = file_path.relative_to(replica_path)
            replica_items[str(rel_path)] = ('file', file_path)
        for dir_name in dirs:
            dir_path = Path(root) / dir_name
            rel_path = dir_path.relative_to(replica_path)
            replica_items[str(rel_path)] = ('dir', dir_path)

    # Copy/update files from source to replica
    for rel_path, (item_type, source_item_path) in source_items.items():
        replica_item_path = replica_path / rel_path

        if item_type == 'file':
            if rel_path not in replica_items:
                # File doesn't exist in replica, copy it
                replica_item_path.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(source_item_path, replica_item_path)
                log(f"COPY {rel_path}")
            else:
                # File exists in replica, check if content is different
                source_md5 = get_file_md5(source_item_path)
                replica_md5 = get_file_md5(replica_item_path)
                if source_md5 and replica_md5 and source_md5 != replica_md5:
                    shutil.copy2(source_item_path, replica_item_path)
                    log(f"UPDATE {rel_path}")
        elif item_type == 'dir':
            if rel_path not in replica_items:
                # Directory doesn't exist in replica, create it
                replica_item_path.mkdir(parents=True, exist_ok=True)
                log(f"COPY {rel_path}")

    # Remove files/directories from replica that don't exist in source
    items_to_remove = []
    for rel_path, (item_type, replica_item_path) in replica_items.items():
        if rel_path not in source_items:
            items_to_remove.append((rel_path, item_type, replica_item_path))

    # Sort by path depth (deepest first) to avoid issues with removing parent before children
    items_to_remove.sort(key=lambda x: x[0].count(os.sep), reverse=True)

    for rel_path, item_type, replica_item_path in items_to_remove:
        try:
            if item_type == 'file':
                replica_item_path.unlink()
                log(f"REMOVE {rel_path}")
            elif item_type == 'dir':
                shutil.rmtree(replica_item_path)
                log(f"REMOVE {rel_path}")
        except Exception as e:
            log(f"ERROR: Failed to remove {rel_path}: {e}")

    log("Synchronization completed successfully")


def main():
    parser = argparse.ArgumentParser(
        description='One-way file synchronization tool'
    )
    parser.add_argument(
        'source_path',
        help='Path to the source directory'
    )
    parser.add_argument(
        'replica_path',
        help='Path to the replica directory'
    )
    parser.add_argument(
        'log_file_path',
        help='Path to the log file'
    )

    args = parser.parse_args()

    global log_file_path
    log_file_path = args.log_file_path

    # Clear or create log file
    try:
        with open(log_file_path, 'w') as f:
            f.write('')
    except Exception as e:
        print(f"ERROR: Failed to initialize log file: {e}")
        return

    log("Starting file synchronization")
    log(f"Source: {args.source_path}")
    log(f"Replica: {args.replica_path}")

    sync_directories(args.source_path, args.replica_path)


if __name__ == '__main__':
    main()
Result
Winning Votes
0 / 3
Average Score
Total Score
Overall Comments
Implements one-way sync with MD5-based updates and logs to console/file with timestamps. However, it has notable correctness/robustness gaps: it does not handle type conflicts (e.g., source file vs replica directory) safely, its removal step can double-delete (it collects both a directory and its children, then rmtree on the directory can cause later failures), and logging relies on a global variable. It also computes MD5 for both sides on every existing file without any quick precheck, which can be expensive.
Correctness
Weight 35%
Correct for many basic cases (copy missing, update via MD5, delete extras), but fails in common edge cases: file/dir type mismatches are not handled; deletion pass may attempt to remove children after rmtree of parent (causing errors and potentially leaving inconsistent state if failures occur).
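The type-mismatch case can be illustrated with a small helper that runs before the copy step. This is a minimal sketch, not part of Answer A: the name `clear_type_conflict` is hypothetical, and it assumes the desired policy is that the source always wins, so a conflicting replica entry is simply deleted.

```python
import shutil
from pathlib import Path


def clear_type_conflict(source_item: Path, replica_item: Path) -> bool:
    """Delete replica_item when it is a file but the source is a directory, or vice versa.

    Hypothetical helper (not in Answer A). Returns True if something was removed.
    """
    if not replica_item.exists() and not replica_item.is_symlink():
        return False  # nothing in the replica to conflict with

    source_is_dir = source_item.is_dir()
    # Treat a symlink in the replica as a plain entry, never as a directory tree.
    replica_is_dir = replica_item.is_dir() and not replica_item.is_symlink()

    if source_is_dir == replica_is_dir:
        return False  # same kind on both sides, no conflict

    if replica_is_dir:
        shutil.rmtree(replica_item)
    else:
        replica_item.unlink()
    return True
```

Calling such a helper for each source item before `shutil.copy2` or `mkdir` would let the normal copy path recreate the entry with the correct type.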
Completeness
Weight 20%
Covers required arguments, one-way sync, MD5 comparison, nested/empty directories, and timestamped logging. Missing key safeguards (source/replica relationship checks) and lacks conflict handling; log entries are not fully consistent (directory creation is logged as COPY, with nothing to distinguish it from a file copy).
Code Quality
Weight 20%
Readable but relies on a global for log_file_path, mixes concerns (logging and sync), and uses duplicated os.walk dictionary building. Error handling is ad-hoc and can mask real failures.
Practical Value
Weight 15%
Usable for simple scenarios, but fragility around type conflicts and deletion ordering reduces suitability for automated server backups. Efficiency suffers due to hashing both files for every comparison without a quick size precheck.
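The size precheck mentioned here could look roughly like the following. It is a sketch under stated assumptions: `file_md5` and `files_differ` are illustrative names, and the 1 MiB chunk size is an arbitrary choice rather than a measured optimum.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 digest of a file, read in chunks (1 MiB is an assumed default)."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_differ(source_file: Path, replica_file: Path) -> bool:
    """Compare sizes first; hash both files only when the sizes match."""
    if source_file.stat().st_size != replica_file.stat().st_size:
        return True  # different sizes imply different content, no hashing needed
    return file_md5(source_file) != file_md5(replica_file)
```

Files of different sizes are then reported as changed after two `stat` calls, and the MD5 comparison is still the deciding check whenever the sizes agree.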
Instruction Following
Weight 10%
Uses argparse, Python 3, one-way sync, MD5-based detection, logs with timestamps to console and file, and runs once. Minor deviations: directory log entries use COPY label and global-based logging is brittle.
Total Score
Overall Comments
Answer A is a functional, readable synchronization script that covers the core requirements. It uses a global variable for the log file path (a code smell), implements a custom logging function instead of the standard `logging` module, and uses a 4096-byte chunk size for MD5 hashing (less efficient for large files). The removal logic sorts by depth which is correct, but the approach of building two full dictionaries via os.walk and then comparing them is slightly less elegant. Error handling is present but basic. The script clears the log file on each run, which may not be desirable. Overall it is solid but has several minor design weaknesses.
Correctness
Weight 35%
Handles the main sync cases correctly: copy new files/dirs, update changed files via MD5, remove extras sorted deepest-first. Does not handle the edge case where a path exists as a file in replica but as a directory in source (or vice versa), and does not handle symlinks. The removal of a directory with shutil.rmtree could remove children that were already individually scheduled for removal, but the depth-sort mitigates this. Overall correct for standard cases.
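An alternative to relying on the depth sort is to drop every path whose ancestor is already scheduled for removal, so each extra subtree is deleted exactly once. The sketch below assumes relative paths as strings, as in Answer A's dictionaries; `topmost_paths` is a hypothetical name.

```python
from pathlib import PurePath


def topmost_paths(rel_paths):
    """Return only the paths that have no ancestor in rel_paths.

    Hypothetical helper: removing these entries (rmtree for directories)
    also removes everything that was filtered out.
    """
    path_set = {PurePath(p) for p in rel_paths}
    kept = []
    for path in sorted(path_set):
        # path.parents yields every ancestor of a relative path, ending with "."
        if not any(parent in path_set for parent in path.parents):
            kept.append(str(path))
    return kept
```

With this filter the log would contain one REMOVE line per deleted subtree rather than one per contained item, which may or may not match the logging the task expects.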
Completeness
Weight 20%
Covers all required features: argparse, MD5 hashing, logging to console and file, one-way sync, nested directories, empty directories. Missing: no path validation beyond existence check, no handling of type conflicts, no expanduser, clears log on each run without mention.
Code Quality
Weight 20%
Uses a global variable for log_file_path which is a code smell. Custom logging function instead of the standard logging module. Functions are reasonably named and structured. No type hints. 4096-byte chunk size is suboptimal for large files. Overall readable but not up to professional standards.
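For comparison, a version built on the standard `logging` module might be set up as below. This is a sketch rather than the expected solution: the logger name `"sync"` and the function name `build_logger` are assumptions, and the format string is chosen to mimic Answer A's `[timestamp] message` lines.

```python
import logging
import sys


def build_logger(log_file_path: str) -> logging.Logger:
    """Return a logger that writes timestamped lines to stdout and to a file (append mode)."""
    logger = logging.getLogger("sync")  # assumed logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False

    # Drop handlers left over from an earlier call so lines are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()

    formatter = logging.Formatter("[%(asctime)s] %(message)s", datefmt="%Y-%m-%d %H:%M:%S")
    handlers = (
        logging.StreamHandler(sys.stdout),
        logging.FileHandler(log_file_path, mode="a"),
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Passing the returned logger into the sync function removes the need for a global, and append mode keeps the history from earlier runs.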
Practical Value
Weight 15%
Works correctly for typical use cases. Clears the log file on each run (may lose history). No path overlap validation. No expanduser. Suitable for basic use but lacks robustness for production server use.
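The missing overlap validation could be approximated with a check like this one, run before any copying starts. It is a minimal sketch: `paths_overlap` is a hypothetical name, and rejecting every nested arrangement is an assumed policy rather than something the task statement requires.

```python
from pathlib import Path


def paths_overlap(source: str, replica: str) -> bool:
    """Return True if the two directories are identical or one lies inside the other."""
    # expanduser handles "~"; resolve normalizes ".." and symlinks (the paths need not exist).
    src = Path(source).expanduser().resolve()
    rep = Path(replica).expanduser().resolve()
    return src == rep or src in rep.parents or rep in src.parents
```

A caller would typically log an error and exit when this returns True, since syncing a directory into itself or into its own subtree can delete or endlessly duplicate data.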
Instruction Following
Weight 10%
Follows all explicit instructions: argparse with three positional args, MD5 hashing, logging to console and file with timestamps, one-way sync, single run (no loop), handles nested/empty dirs. Minor deviation: clears log file on each run rather than appending.
Total Score
Overall Comments
Answer A provides a functional script that meets the basic requirements of the prompt. It correctly uses `argparse` and implements the core synchronization logic using `os.walk`. However, it suffers from several significant weaknesses: it has a correctness bug where it fails to handle file/directory type mismatches, it lacks crucial robustness checks like validating input paths to prevent dangerous operations (e.g., syncing a directory into itself), and its code quality is diminished by the use of a global variable and a custom logging function instead of Python's standard `logging` module.
Correctness
Weight 35%
The script's logic fails to handle the edge case where a path exists as a file in the source and a directory in the replica (or vice-versa). In this scenario, it would attempt to calculate an MD5 hash on a directory, causing an error. The core logic for simple cases is present, but this flaw reduces its correctness score.
Completeness
Weight 20%
The answer fulfills the main requirements listed in the prompt, such as using `argparse`, MD5 hashing, and logging. However, it lacks features expected of a complete command-line tool, most notably input validation, which makes it less of a finished product.
Code Quality
Weight 20%
The code is functional and reasonably structured into functions. However, the use of a global variable for the log file path is a significant design flaw that reduces modularity. Additionally, it implements a custom logging function instead of leveraging Python's powerful and standard `logging` module.
Practical Value
Weight 15%
The script has limited practical value for its intended purpose as a server backup tool due to its lack of robustness. Without input path validation, a simple user error could lead to data loss or unintended behavior. The correctness bug also makes it unreliable.
Instruction Following
Weight 10%
The answer follows all instructions very well. It uses the specified libraries and implements all requested features. The log message format ("COPY rel_path") perfectly matches the example given in the prompt.