Orivel

Latest Tasks & Discussions

Browse the latest benchmark content across tasks and discussions. Switch by genre to focus on what you want to compare.

System Design

Anthropic Claude Opus 4.7 VS Google Gemini 2.5 Flash

Design a Scalable Concert Ticket Reservation System

Design a system for an online concert ticketing platform. Users can browse events, view seat availability, reserve specific seats for 10 minutes, pay through an external payment provider, and receive a digital ticket. The platform runs in one cloud region across multiple availability zones.

Explicit constraints:

- 3 million registered users and 500,000 daily active users.
- Major on-sale events can reach 150,000 concurrent users.
- Peak load is 8,000 seat reservation attempts per second and 2,000 payment attempts per second.
- Each event has up to 60,000 seats.
- The system must never sell the same seat twice.
- Seat reservations expire after 10 minutes if unpaid.
- p95 latency for browsing and seat-map reads should be under 300 ms.
- p95 latency for reservation confirmation should be under 800 ms, excluding payment-provider time.
- Availability target during on-sale windows is 99.95%.
- Recovery point objective is under 1 minute; recovery time objective is under 15 minutes.
- Payment provider callbacks are at-least-once, may arrive out of order, and may be delayed by up to 5 minutes.

Provide a design plan. Include:

- the main services and data stores,
- core APIs,
- the data model for seats and reservations,
- the request flow for browsing, reserving, paying, and expiring reservations,
- the scaling strategy for traffic spikes,
- the reliability and disaster recovery approach,
- consistency choices that prevent overselling,
- monitoring and alerting,
- key trade-offs or alternatives you considered.

State any reasonable assumptions you make.
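To make the no-double-sale and 10-minute-expiry constraints concrete, here is a minimal single-process sketch of a seat state machine. It is not part of the task and not any model's answer. The names `SeatMap`, `reserve`, and `confirm_payment` are hypothetical, and the in-memory dictionary stands in for a store that offers atomic conditional writes. One assumed policy is that a late payment callback is honoured as long as nobody else has taken the seat in the meantime.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

HOLD_SECONDS = 600  # 10-minute reservation window from the task statement


@dataclass
class Seat:
    status: str = "AVAILABLE"      # AVAILABLE | HELD | SOLD
    holder: Optional[str] = None
    hold_expires_at: float = 0.0


class SeatMap:
    """Hypothetical in-memory stand-in for a store with conditional writes."""

    def __init__(self, seat_ids: Iterable[str]):
        self.seats: Dict[str, Seat] = {s: Seat() for s in seat_ids}

    def reserve(self, seat_id: str, user_id: str, now: float) -> bool:
        seat = self.seats[seat_id]
        # Lazy expiry: an unpaid hold past its deadline counts as available.
        if seat.status == "HELD" and now >= seat.hold_expires_at:
            seat.status, seat.holder = "AVAILABLE", None
        if seat.status != "AVAILABLE":
            return False
        seat.status, seat.holder = "HELD", user_id
        seat.hold_expires_at = now + HOLD_SECONDS
        return True

    def confirm_payment(self, seat_id: str, user_id: str) -> str:
        """Idempotent handler for at-least-once payment callbacks."""
        seat = self.seats[seat_id]
        if seat.holder == user_id and seat.status in ("HELD", "SOLD"):
            seat.status = "SOLD"   # a duplicate callback is a no-op
            return "SOLD"
        # The seat was released or taken by someone else: never oversell.
        return "REFUND"
```

In a real deployment the check-and-set inside `reserve` would have to be a single atomic operation in the data store (for example a conditional update keyed on the seat's current status); the sketch only shows the transitions that operation must enforce.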

174
May 19, 2026 09:49

Analysis

OpenAI GPT-5.5 VS Google Gemini 2.5 Flash

Choosing a Database for a Growing SaaS Startup

You are advising the CTO of a two-year-old B2B SaaS startup that provides project management software to mid-sized companies. The current setup uses a single PostgreSQL instance, and it is now showing strain: read queries on dashboards take 3–8 seconds during peak hours, the database is 800 GB and growing ~40 GB/month, and the team expects user count to triple over the next 12 months. The engineering team has 9 developers, only one of whom has significant database administration experience. Budget is constrained but not severely limited.

The CTO is weighing four options:

1. Vertically scale the existing PostgreSQL instance and add read replicas.
2. Migrate to a managed distributed SQL database (e.g., CockroachDB or Spanner-like service).
3. Split the workload: keep PostgreSQL for transactional data, introduce a separate analytical store (e.g., ClickHouse or BigQuery) for dashboards.
4. Migrate to a NoSQL document database (e.g., MongoDB or DynamoDB).

Write an analysis (roughly 500–800 words) that:

- Evaluates each of the four options against the startup's specific constraints (performance bottleneck location, team expertise, growth trajectory, budget).
- Identifies the key trade-offs and risks of each option.
- Reaches a clear, justified recommendation (you may recommend one option or a phased combination).
- Specifies what evidence or measurements you would want to verify before committing to the recommendation.

Be concrete: refer to the numbers given, and avoid generic database advice that ignores the scenario.
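As a quick illustration of the growth numbers in this scenario, the sketch below projects database size 12 months out. The function name `projected_size_gb` is hypothetical. The second projection assumes that monthly growth scales in proportion to the tripled user count from day one, which the scenario does not state, so treat it as a rough upper bound rather than a forecast.

```python
def projected_size_gb(current_gb: float, growth_gb_per_month: float, months: int) -> float:
    """Linear projection: current size plus constant monthly growth."""
    return current_gb + growth_gb_per_month * months


# Growth stays at the stated ~40 GB/month.
steady = projected_size_gb(800, 40, 12)

# Assumption (not in the scenario): growth triples along with users.
tripled = projected_size_gb(800, 40 * 3, 12)

print(steady, tripled)  # 1280 2240
```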

210
May 16, 2026 09:38

Coding

OpenAI GPT-5.5 VS Google Gemini 2.5 Flash

Rate Limiter with Sliding Window and Burst Allowance

Design and implement a thread-safe rate limiter in a language of your choice (Python, Go, Java, TypeScript, or Rust) that supports the following requirements:

1. **API surface**: Expose at least these operations:
   - `allow(client_id: str, cost: int = 1) -> bool`: returns whether the request is permitted right now.
   - `retry_after(client_id: str) -> float`: returns seconds until at least 1 unit of capacity is available (0 if currently allowed).
   - A constructor that accepts per-client configuration: `rate` (units per second), `burst` (max units stored), and an optional `window_seconds` for sliding-window accounting.
2. **Algorithm**: Implement a hybrid that combines a **token bucket** (for burst tolerance) with a **sliding-window log or counter** (to bound the total requests permitted within `window_seconds`, preventing sustained abuse that a pure token bucket would allow after refills). A request is permitted only if both checks pass. Justify your data-structure choice for the sliding window (exact log vs. weighted two-bucket approximation) and discuss memory/accuracy tradeoffs in a short comment block or accompanying note.
3. **Concurrency**: The limiter will be hit by many threads/goroutines concurrently for the same and different `client_id`s. Avoid a single global lock becoming a bottleneck (e.g., per-client locks or lock striping). Document why your approach is correct under concurrent `allow` calls (no double-spend of tokens, no lost updates).
4. **Time source**: Make the clock injectable so tests are deterministic. Use a monotonic clock by default.
5. **Edge cases to handle explicitly**:
   - `cost` larger than `burst` (must reject, never block forever).
   - Clock going backwards or large pauses (e.g., suspended VM): clamp rather than crash, and don't grant unbounded tokens.
   - First-ever request for a new client (lazy initialization).
   - Stale client cleanup (memory must not grow unbounded if clients stop calling).
   - Fractional tokens / sub-millisecond timing.
6. **Tests**: Provide at least 6 unit tests using the injectable clock that cover: basic allow/deny, burst draining and refill, sliding-window cap independent of bucket refill, `cost > burst`, concurrent contention on one client (deterministic property: total permitted in T seconds ≤ rate*T + burst), and stale-client eviction.
7. **Complexity**: State the amortized time complexity of `allow` and the memory complexity per client.

Deliver: complete runnable code (single file is fine, but you may split files if you label them clearly), the tests, and a brief design note (max ~250 words) explaining your choices and the precise semantics when the two algorithms disagree.
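The hybrid check described in this task can be sketched as below. This is a partial illustration, not a full solution: it covers only `allow`, with an exact sliding-window log, and omits `retry_after` and stale-client eviction. The class name `HybridLimiter` and the `window_limit` parameter (the cap on units per window) are assumptions, since the task does not name how the window cap is configured. A short registry lock guards only the dictionary lookup; all accounting happens under a per-client lock.

```python
import threading
import time
from collections import deque
from typing import Callable, Deque, Dict, Optional, Tuple


class _ClientState:
    def __init__(self, burst: float, now: float):
        self.lock = threading.Lock()
        self.tokens = float(burst)   # a new client starts with a full bucket
        self.last = now
        self.log: Deque[Tuple[float, int]] = deque()  # (timestamp, cost)
        self.window_used = 0


class HybridLimiter:
    def __init__(self, rate: float, burst: float,
                 window_seconds: Optional[float] = None,
                 window_limit: Optional[int] = None,  # hypothetical parameter
                 clock: Callable[[], float] = time.monotonic):
        self.rate, self.burst = rate, burst
        self.window_seconds, self.window_limit = window_seconds, window_limit
        self.clock = clock
        self._clients: Dict[str, _ClientState] = {}
        self._registry_lock = threading.Lock()

    def _state(self, client_id: str) -> _ClientState:
        # Held only for the lookup or lazy creation, never during accounting.
        with self._registry_lock:
            st = self._clients.get(client_id)
            if st is None:
                st = self._clients[client_id] = _ClientState(self.burst, self.clock())
            return st

    def allow(self, client_id: str, cost: int = 1) -> bool:
        if cost <= 0 or cost > self.burst:
            return False  # can never succeed, so reject immediately
        st = self._state(client_id)
        windowed = self.window_seconds is not None and self.window_limit is not None
        with st.lock:
            now = max(self.clock(), st.last)  # clamp a backwards clock
            # Refill, capped at burst so long pauses grant bounded tokens.
            st.tokens = min(self.burst, st.tokens + (now - st.last) * self.rate)
            st.last = now
            if windowed:
                cutoff = now - self.window_seconds
                while st.log and st.log[0][0] <= cutoff:
                    st.window_used -= st.log.popleft()[1]
                if st.window_used + cost > self.window_limit:
                    return False
            if st.tokens < cost:
                return False
            # Both checks passed: spend tokens and record in the window log.
            st.tokens -= cost
            if windowed:
                st.log.append((now, cost))
                st.window_used += cost
            return True
```

Because refill, both checks, and the spend all happen inside one per-client critical section, two concurrent `allow` calls for the same client cannot both spend the same tokens. The exact log costs memory proportional to the number of permitted requests in the window; a weighted two-bucket counter would trade some accuracy for constant memory.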

190
May 12, 2026 09:45

