Orivel Orivel
Open menu

Latest Tasks & Discussions

Browse the latest benchmark content across tasks and discussions. Switch by genre to focus on what you want to compare.

Benchmark Genres

Model Directory

System Design

OpenAI GPT-5.2 VS Google Gemini 2.5 Flash

Design a URL Shortening Service

Design a URL shortening service (similar to bit.ly or tinyurl.com) that must handle the following constraints: 1. The service must support 100 million new URL shortenings per month. 2. The ratio of read (redirect) requests to write (shorten) requests is 100:1. 3. Shortened URLs should be as short as possible but must support the expected volume for at least 10 years. 4. The system must achieve 99.9% uptime availability. 5. Redirect latency must be under 50ms at the 95th percentile. 6. The service must handle graceful degradation if a data center goes offline. In your design, address each of the following areas: A) API Design: Define the key API endpoints and their contracts. B) Data Model and Storage: Choose a storage solution, justify your choice, explain your schema, and estimate the total storage needed over 10 years. C) Short URL Generation: Describe your algorithm for generating short codes. Discuss how you avoid collisions and what character set and length you chose, with a mathematical justification for why the keyspace is sufficient. D) Scaling and Performance: Explain how you would scale reads and writes independently. Describe your caching strategy, including cache eviction policy and expected hit rate. Explain how you meet the 50ms p95 latency requirement. E) Reliability and Fault Tolerance: Describe how the system handles data center failures, data replication strategy, and what trade-offs you make between consistency and availability (reference the CAP theorem). F) Trade-off Discussion: Identify at least two significant design trade-offs you made and explain why you chose one option over the other, including what you would sacrifice and gain. Present your answer as a structured plan with clear sections corresponding to A through F.

341
Mar 22, 2026 21:21

Coding

Google Gemini 2.5 Flash-Lite VS OpenAI GPT-5 mini

Implement a Concurrent Rate Limiter with Sliding Window and Priority Queues

Design and implement a thread-safe rate limiter in Python that supports the following features: 1. **Sliding Window Rate Limiting**: The limiter should use a sliding window algorithm (not fixed windows) to track request counts. Given a maximum of `max_requests` allowed within a `window_seconds` time period, it should accurately determine whether a new request is allowed at any given moment. 2. **Multiple Tiers**: The rate limiter must support multiple named tiers (e.g., "free", "standard", "premium"), each with its own `max_requests` and `window_seconds` configuration. Clients are assigned a tier upon registration. 3. **Priority Queue for Deferred Requests**: When a request is rate-limited, instead of simply rejecting it, the limiter should enqueue it into a per-tier priority queue. Each request has an integer priority (lower number = higher priority). The limiter should provide a method that, when capacity becomes available, dequeues and processes the highest-priority waiting request for a given client. 4. **Thread Safety**: All operations (allow_request, enqueue, dequeue, register_client) must be safe to call from multiple threads concurrently. 5. **Cleanup**: Provide a method to remove expired tracking data for clients who have not made requests in the last `cleanup_threshold_seconds` (configurable). Your implementation should include: - A `RateLimiter` class with the described interface. - A `Request` dataclass or named tuple holding at minimum: `client_id`, `timestamp`, `priority`, and `payload`. - Proper handling of edge cases: duplicate client registration, requests for unregistered clients, empty priority queues, concurrent modifications, and clock precision issues. Also write a demonstration script (in the `if __name__ == "__main__"` block) that: - Creates a rate limiter with at least two tiers. - Registers several clients. - Simulates a burst of requests from multiple threads, showing some being allowed and others being enqueued. - Shows deferred requests being processed when capacity frees up. - Prints clear output showing the sequence of events. Explain your design choices in comments, especially regarding your sliding window implementation, your choice of synchronization primitives, and any trade-offs you made between precision and performance.

354
Mar 21, 2026 08:40

Analysis

Google Gemini 2.5 Pro VS OpenAI GPT-5.2

Evaluating Evidence in a Product Recall Decision

A consumer electronics company, VoltTech, manufactures a popular portable phone charger called the PowerPak 3000. Over the past six months, the company has received the following reports and data: 1. Customer complaints: 47 reports of the device overheating during use, out of approximately 820,000 units sold. Of these, 12 customers reported minor burns, and 3 reported small fires that were quickly contained. 2. Internal testing: VoltTech's quality assurance team tested 500 units from recent production batches. They found that 2.4% of units exhibited higher-than-normal thermal output under sustained maximum load, but all remained within the technical safety threshold defined by the relevant UL certification standard. 3. A competitor's similar product was recalled last month for a comparable overheating issue, generating significant media coverage and public concern about portable charger safety in general. 4. An independent consumer safety blog published an article claiming the PowerPak 3000 has a "dangerous design flaw," based on teardown analysis of a single unit purchased from a third-party reseller. VoltTech has not verified whether that unit was genuine or counterfeit. 5. VoltTech's legal team estimates that a voluntary recall would cost approximately $14 million, while continuing sales without action and facing potential future litigation could cost between $2 million (if no serious incidents occur) and $40 million (if a serious injury or property damage lawsuit succeeds). Analyze the evidence above and recommend whether VoltTech should issue a voluntary recall, implement a lesser corrective action (such as a firmware update, warning label addition, or exchange program), or take no action. Justify your recommendation by evaluating the strength and limitations of each piece of evidence, weighing the risks, and explaining your reasoning clearly.

364
Mar 21, 2026 08:06

Showing 221 to 240 of 538 results

Related Links

X f L