BIG BOX Hosting Services IP Warmup № 02.04

Reputation, built
not faked.

A 4 to 6-week structured ramp from cold IP to production-ready inbox placement, run by humans who watch the metrics every morning. Real recipient engagement only, segmented by recency, paced against the current Gmail and Microsoft thresholds. No warmup-network tools, no fake opens, no shortcuts that get you delisted in week three.

01  /  What this is

The first ten days decide the next twelve months.

IP warmup is the single engagement where small mistakes cost the most. The reputation built in the first two weeks of a new dedicated IP — for better or worse — sets the placement floor for everything that follows. There are very few opportunities to fix it after the fact, and the ones that exist are slow and expensive.

An IP that has never sent mail before has no reputation at any of the major mailbox providers. Gmail does not know whether you are a serious B2B sender or a Belarusian spam farm; Microsoft cannot tell whether your traffic profile is legitimate or whether you are about to burn the address. The default response is suspicion. Send 10,000 messages from a fresh IP on day one and Gmail will defer most of them, Microsoft will route them to junk, and Yahoo will quietly throttle the connection to a crawl. Inside a week, the IP carries a negative reputation that takes 30 to 60 days of perfect sending behaviour to reverse, assuming the IP has not been added to a blacklist in the meantime.

The mechanics of warmup are well-documented and rigorously enforced. Mailbox providers track sending reputation on a rolling 30-day window. The first two weeks of activity matter more than the next four because the early signal-to-noise ratio is highest — a tiny volume of high-engagement mail is what tells Gmail the IP belongs to a sender who knows what they are doing. The opposite signal — high volume early, mediocre engagement, scattered traffic across receiving domains — is the spammer profile, and the algorithms are tuned to recognise it.

The non-negotiable inputs are list quality plus authentication plus engagement segmentation. Bounces above 2% in week one mean your list is dirty, and the warmup will fail no matter how careful the schedule is. SPF authentication plus DKIM signing plus DMARC alignment must be live and passing before the first send — these are no longer optional under the February 2024 Google/Yahoo bulk sender rules and the May 2025 Microsoft enforcement that followed. And the list segment hitting the new IP in weeks one and two must be your most recently engaged subscribers — people who opened or clicked in the last 30 days. Anything older introduces the kind of soft-fail noise that kills early reputation.
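As a rough illustration, the three inputs above can be expressed as a go/no-go check run before day 1. This is a minimal sketch in Python, not our internal tooling: `PreflightInputs` and `preflight_blockers` are hypothetical names, the 2% bounce ceiling is the only threshold taken from the text, and the rule that the 30-day engaged cohort must at least cover the first send is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PreflightInputs:
    spf_pass: bool            # SPF passes for the sending domain
    dkim_pass: bool           # DKIM signature verifies
    dmarc_aligned: bool       # DMARC alignment passes
    bounce_rate: float        # observed or estimated, as a fraction (0.02 == 2%)
    engaged_30d_count: int    # recipients who opened or clicked in the last 30 days


def preflight_blockers(inputs: PreflightInputs, first_send_volume: int) -> List[str]:
    """Return the reasons not to start the warmup; an empty list means go."""
    blockers = []
    if not (inputs.spf_pass and inputs.dkim_pass and inputs.dmarc_aligned):
        blockers.append("authentication: SPF, DKIM and DMARC must all pass")
    if inputs.bounce_rate > 0.02:
        blockers.append("list quality: bounce rate above 2%")
    # Assumption for this sketch: the engaged cohort must cover the first send.
    if inputs.engaged_30d_count < first_send_volume:
        blockers.append("segmentation: 30-day engaged cohort smaller than first send")
    return blockers
```

An empty result means none of the three inputs blocks the first send; it says nothing about how the warmup will go after that.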

What we do is run all of this with discipline and continuity. The same engineer watches the IP every morning of the warmup. The schedule is paced against the actual placement metrics coming back through Google Postmaster Tools and Microsoft SNDS, not against a calendar. When numbers wobble — a single day of elevated complaints, a transient deferral pattern at Yahoo — we hold the schedule for 24 to 48 hours and let the reputation catch up before scaling further. The total elapsed time is typically 28 to 42 days. The total engineering time is roughly two hours per IP per week.

02  /  The schedule we actually run

A six-week curve, tuned daily.

The schedule below is a starting point for senders targeting 1M sends/day at full volume. Lower targets compress the timeline; higher targets extend it. We adjust against the live metrics every morning rather than treating the schedule as a fixed plan.

The schedule is built around three principles. First, engaged segments go first, expanding outward as reputation builds. Second, volume increases by no more than 2× per day, and never on a Friday or on the day before a public holiday in the recipient cohort. Third, ISPs are warmed in parallel rather than sequentially: each day's volume is distributed across Gmail, Microsoft, Yahoo, and Apple in proportion to the eventual production mix. Warming up Gmail first and adding Microsoft in week three is one of the most common mistakes; the result is great Gmail placement and terrible Outlook placement when full volume eventually lands.
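Two of these principles are mechanical enough to sketch in code. The Python below is illustrative only: `split_by_provider` and `step_allowed` are hypothetical helper names, and the traffic mix in the usage note is a made-up example, not a recommended split.

```python
from datetime import date, timedelta
from typing import Dict, FrozenSet


def split_by_provider(total: int, mix: Dict[str, float]) -> Dict[str, int]:
    """Distribute a daily total across providers in proportion to the mix.

    Uses largest-remainder rounding so the parts always sum to the total.
    """
    weight_sum = sum(mix.values())
    raw = {p: total * w / weight_sum for p, w in mix.items()}
    alloc = {p: int(v) for p, v in raw.items()}
    leftover = total - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda p: raw[p] - alloc[p], reverse=True)
    for p in by_remainder[:leftover]:
        alloc[p] += 1
    return alloc


def step_allowed(prev_volume: int, next_volume: int, send_date: date,
                 holidays: FrozenSet[date] = frozenset()) -> bool:
    """Check a proposed volume step against the 2x cap and the calendar rules."""
    if next_volume <= prev_volume:
        return True   # holding or reducing volume is always allowed
    if next_volume > 2 * prev_volume:
        return False  # more than double the previous day
    if send_date.weekday() == 4:
        return False  # no increases on a Friday
    if send_date + timedelta(days=1) in holidays:
        return False  # no increases the day before a public holiday
    return True
```

For example, `split_by_provider(500, {"gmail": 50, "microsoft": 25, "yahoo": 15, "apple": 10})` spreads the day 1 volume across all four receivers at once, which is the opposite of warming Gmail first and adding Microsoft later.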

Week-by-week ramp toward 1M/day production volume · conservative
# Engaged cohort = opened or clicked in last 30 / 60 / 90 days
# Volumes are TOTAL across all receiving domains, distributed
# proportional to your production traffic mix.

Week 1  # 30-day engaged only. Hard floor.
  Day 1:    500      Day 2:   1,000     Day 3:   2,000
  Day 4:    4,000    Day 5:   8,000     Day 6:  12,000
  Day 7:   16,000    # Pause and review. Fri/Sat/Sun if needed.

Week 2  # 30-day engaged only. Continue.
  Day 8:   25,000    Day 9:  35,000     Day 10: 50,000
  Day 11:  70,000    Day 12: 90,000     Day 13: 110,000
  Day 14: 130,000    # End of critical window. Reputation now established.

Week 3  # Expand to 60-day engaged. Add in 15% chunks.
  Day 15: 160,000    Day 16: 200,000    Day 17: 240,000
  Day 18: 280,000    Day 19: 320,000    Day 20: 360,000
  Day 21: 400,000

Week 4  # Continue expansion. 60-day engaged, full pool.
  Day 22: 450,000    Day 23: 500,000    Day 24: 560,000
  Day 25: 620,000    Day 26: 680,000    Day 27: 740,000
  Day 28: 800,000

Week 5-6  # 90-day engaged becomes safe. Full target volume.
  Days 29-42: ramp to 1,000,000/day at +5% per day,
              hold for 7 consecutive days at target before
              calling the warmup complete.
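
The week 5-6 line can be expanded into concrete daily numbers. The sketch below is one reading of it, under stated assumptions: it starts from the day 28 volume of 800,000, compounds +5% on the previous day's volume using integer arithmetic, and counts the first day at target as day one of the seven-day hold. `final_ramp` is a hypothetical name, and a live warmup would deviate from this whenever the metrics call for a hold.

```python
from typing import List, Tuple


def final_ramp(start: int = 800_000, target: int = 1_000_000,
               growth_pct: int = 5, hold_days: int = 7,
               first_day: int = 29) -> List[Tuple[int, int]]:
    """Return (day, volume) pairs for the final ramp and the hold at target."""
    plan = []
    volume = start
    day = first_day
    # Compound growth on the previous day's volume, capped at the target.
    while volume < target:
        volume = min(target, volume * (100 + growth_pct) // 100)
        plan.append((day, volume))
        day += 1
    # The day the target is first reached counts as hold day 1.
    for _ in range(hold_days - 1):
        plan.append((day, target))
        day += 1
    return plan


for day, volume in final_ramp():
    print(f"Day {day}: {volume:,}")
```

Under these assumptions the target is reached on day 33 and the hold ends on day 39, which leaves a few days of slack inside the 42-day window for a pause.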

The schedule is the public version. The private version, which we run against live data, has 22 additional decision points where we either hold steady or accelerate or fall back depending on what we see in Gmail Postmaster Tools, Microsoft SNDS, and the per-IP placement results from the seed-list testing. The decisions look something like this.

Daily decision matrix · excerpt
# Read each morning at 09:00 Europe/Ljubljana, applied to the day's send.

if complaint_rate > 0.30% over_24h:
    action: pause_24h, investigate_segment, halve_volume_on_resume

elif complaint_rate > 0.15% over_24h:
    action: hold_volume_24h, monitor_closely

if bounce_rate > 2.0% over_24h:
    action: pause, re-verify_remaining_list, do_not_resume_until_clean

if gmail_postmaster_reputation == "Bad":
    action: pause_72h, investigate_engagement_drop

if microsoft_snds_filter_result == "Yellow":
    action: hold_microsoft_volume, increase_engaged_segment_share

if all_metrics_green for 3 consecutive_days:
    action: proceed_to_next_step_in_schedule
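
The excerpt above is shorthand, not executable code. A minimal Python rendering might look like the sketch below. The thresholds and action names are copied from the excerpt; the `DailyMetrics` container and the `decide` function are hypothetical, and the fallback for a clean day that has not yet reached a three-day green streak is an assumption, since the excerpt does not cover that case.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DailyMetrics:
    complaint_rate: float         # fraction over the last 24h (0.003 == 0.30%)
    bounce_rate: float            # fraction over the last 24h
    gmail_reputation: str         # as read from Google Postmaster Tools
    snds_filter_result: str       # as read from Microsoft SNDS
    consecutive_green_days: int   # days in a row with every metric green


def decide(m: DailyMetrics) -> List[str]:
    """Return the actions to apply to today's send."""
    actions: List[str] = []
    if m.complaint_rate > 0.0030:
        actions += ["pause_24h", "investigate_segment", "halve_volume_on_resume"]
    elif m.complaint_rate > 0.0015:
        actions += ["hold_volume_24h", "monitor_closely"]
    if m.bounce_rate > 0.020:
        actions += ["pause", "re-verify_remaining_list", "do_not_resume_until_clean"]
    if m.gmail_reputation == "Bad":
        actions += ["pause_72h", "investigate_engagement_drop"]
    if m.snds_filter_result == "Yellow":
        actions += ["hold_microsoft_volume", "increase_engaged_segment_share"]
    if not actions:
        if m.consecutive_green_days >= 3:
            actions.append("proceed_to_next_step_in_schedule")
        else:
            # Assumption: the excerpt does not say what happens on a clean day
            # before the three-day streak is reached; this sketch holds volume.
            actions.append("hold_volume_24h")
    return actions
```

The rules are independent rather than a single if/elif chain, so one morning can produce several actions at once, for example a complaint hold and a Microsoft-specific hold.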

The thing the schedule does not show is the cohort work. The 30-day engaged segment in week one is not just "people who opened in the last 30 days" — it is the most active subset of that group, prioritised by recency-frequency-monetary scoring where it is available, by recency-clicks where it is not. Apple Mail Privacy Protection has made open rates an unreliable signal since 2021, so we lean on click events plus reply events plus conversion events as the primary engagement indicators. Subject lines in week one are deliberately plain — no aggressive promotional copy, no urgency triggers, no spam-shaped patterns that would amplify any borderline content filtering.
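
To make the cohort work concrete, here is a minimal scoring sketch for the case where only recency and click-type signals are available. The weights (clicks 1, replies 2, conversions 3) and the linear recency decay are illustrative assumptions, not the values we use, and `Recipient`, `engagement_score`, and `week_one_cohort` are hypothetical names. Opens are left out on purpose.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Recipient:
    address: str
    last_engaged: Optional[date]   # most recent click, reply or conversion
    clicks: int                    # counts within the engagement window
    replies: int
    conversions: int


def engagement_score(r: Recipient, today: date, window_days: int = 30) -> float:
    """Score recency times weighted frequency; zero outside the window."""
    if r.last_engaged is None:
        return 0.0
    age = (today - r.last_engaged).days
    if age < 0 or age > window_days:
        return 0.0
    recency = (window_days + 1 - age) / (window_days + 1)
    frequency = r.clicks + 2 * r.replies + 3 * r.conversions
    return recency * frequency


def week_one_cohort(recipients: List[Recipient], today: date, size: int) -> List[str]:
    """Return the addresses of the highest-scoring recipients, best first."""
    scored = [(engagement_score(r, today), r.address) for r in recipients]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [address for _, address in ranked[:size]]
```

Calling `week_one_cohort(recipients, today, 500)` would give the day 1 send list for the schedule above; recipients with no engagement inside the window never make the cut, whatever the requested size.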

Here's the thing: most warmup failures aren't schedule problems — they're list quality problems. We've seen teams obsess over whether to send 200 or 500 on Day 1 while sitting on a list with 8% invalid addresses. Fix the data first. The schedule is forgiving; bad data isn't. — Operations note, internal warmup runbook v6
─────────────────────────────────────────────────────────────────────────
02b  /  What the curve actually looks like

The plan vs what really happens.

Three real client warmups from the last quarter, plotted against the planned 42-day curve. Names anonymised, numbers preserved. The shape of the recovery on Client C — when Yahoo started deferring on day 11 — is the part nobody publishes because it requires admitting warmup is feedback-driven, not deterministic.

The schedule above is the plan. The chart below is what actually happens. We pulled three real warmups from the last quarter — anonymised by client, identified only by send profile — and overlaid the planned daily volume against the actual achieved volume. Two of the three tracked the plan within 5 percent. The third hit a wall on day 11 when Yahoo started deferring 30 percent of injected messages, and the curve shows the recovery — three days at reduced volume, then resumption on the planned trajectory by day 16. The shape of the recovery is what most warmup vendors do not publish, because it requires admitting that warmup is not deterministic.

Three datasets on the chart. The dotted teal line is the planned curve from our standard 42-day playbook — starts at 50 messages on day one, doubles every two days through day 14, transitions to multiplicative scaling through day 28, then linear ramp to full volume by day 42. The solid lines are three actual client warmups. Client A tracked tightly. Client B underran the plan slightly because their list quality was better than typical and we accelerated cautiously. Client C hit the Yahoo wall on day 11. The dip on the Client C curve between days 11 and 14 is the recovery period — we cut volume by 60 percent across all receivers for 72 hours while reputation re-stabilised, then resumed. The interesting comparison is between Client B's smooth ramp and Client C's recovery — both ended up at full volume on schedule, but the path to get there was substantially different. Warmup is not a fixed sequence. It is a planned trajectory with feedback corrections.

Chart: daily message volume across 42-day warmup · planned vs actual

Methodology: actual achieved daily volume from three production warmups in Q4 2025, normalised to the same target volume of 200,000 messages/day at full ramp. Planned curve is our standard 42-day playbook. Client A: B2B SaaS, 80k list. Client B: e-commerce transactional, 150k list, lower complaint history. Client C: B2C marketing, 200k list, hit Yahoo deferral on day 11 due to early-warmup engagement signal misread. Recovery applied per playbook section 4.3 (volume reduction, observation window, graduated resumption).

Two takeaways for buyers comparing warmup services. First, the only honest warmup vendor is one that shows you what failure looks like and how it gets corrected. Anyone who publishes only smooth curves is either hiding their bad ones or has not run enough warmups to have any. Second, the recovery from a Yahoo hit on day 11 is not a magic technique. It is conservative volume reduction, patient observation, and resumption only after aggregate reports confirm the issue cleared, applied with the discipline of an operator who has watched the same shape replay across many clients. Three to five days lost. That is normal. Operators who claim faster recovery are doing one of two things wrong, and we have audited both kinds. What you are buying with a warmup engagement is the patience and the feedback discipline far more than the schedule itself.

─────────────────────────────────────────────────────────────────────────
03  /  What we won't do

No warmup networks. Not now, not at any price.

We get asked about warmup-network tools — services like Mailwarm or Lemwarm or Warmbox among a dozen others — at least twice a month. The answer is no. The reasons are technical and ethical, in that order.

Warmup-network tools work by adding your sending account to a pool of mailboxes that automatically open or click or reply to each other's mail. The pitch is that you build "engagement signals" that mailbox providers reward without needing real recipients. The reality is that the mailbox providers — Gmail and Microsoft especially — have been catching this pattern reliably since at least 2021. Validity's published analysis is direct on the subject: manufactured engagement lacks the normal negative signals (deletes, archives without reading, occasional spam marks) that real recipient behaviour produces. The traffic profile of a warmup-network mailbox is detectable, and traffic from senders associated with detectable warmup networks gets penalised.

The senders who use these tools and who do not get caught are typically running cold outreach at very low volumes from individual mailboxes — and the warmup network is propping up reputation that would not exist otherwise. The senders who try to use warmup networks for bulk marketing or transactional infrastructure burn the reputation faster than they would have without the tool. We have done two emergency recoveries this year for clients who tried to shortcut a warmup with a network tool — both took 60+ days of remediation work and one resulted in the IP being permanently retired because the reputation damage was not recoverable.

The reason real warmup works is not because mailbox providers are stupid. It is because they are evaluating a specific signal: does this sender have an audience that genuinely wants their mail? The proof of that is engagement from real recipients in their actual inbox patterns — opens followed by clicks followed by replies followed by occasional unsubscribes followed by re-engagement weeks later when something in the subject line catches them. That pattern cannot be faked at scale. It can only be earned by sending genuinely useful mail to people who asked for it, slowly, until the reputation builds.

The corollary is that we will not warm an IP for a sender who does not have a real engaged segment to send to in week one. If your list is purchased or scraped or so old that the 30-day engaged cohort is empty, the warmup will fail and there is no point starting it. We will tell you that during the intake call. The honest path forward in that case is list rebuilding — re-engagement campaign on a separate IP first, double opt-in for new sign-ups, then a fresh warmup once you have a 30-day engaged segment that is actually engaged.

─────────────────────────────────────────────────────────────────────────
04  /  What's included

Engineering hours, not a checklist.

The €199 per IP is engineering time, not a tooling subscription. Roughly two hours per IP per week of senior engineering attention across the warmup window — daily metrics review, schedule adjustment, blacklist monitoring, and the cohort work that makes the whole thing work.

Pre-warmup audit
List verification status review (we will run a sample through commercial verification at no charge — about 5,000 addresses), authentication verification (SPF, DKIM, DMARC live and aligned), domain reputation check, blacklist baseline check, engagement segment analysis. Issues here are fixed before day 1, not discovered in week 2.
Scheduled ramp
Day-by-day volume schedule customised for your target volume and your engaged-segment size. The 6-week version above is a typical starting point; senders targeting 100K/day complete in ~28 days, senders targeting 5M/day stretch to 56 days.
Daily metrics review
Every morning of the warmup, an engineer reads the previous 24h numbers — complaint rate, bounce rate, per-ISP placement from seed testing, Google Postmaster Tools reputation, Microsoft SNDS filter result. Decisions get made before the day's send goes out, not after.
Per-ISP throttling
Volume distributed proportionally across Gmail, Outlook, Yahoo, Apple, and the smaller European providers (ProtonMail, Tutanota, Mail.de, Orange.fr, Yandex) at the rates each receives in your eventual production mix. No "warm Gmail first, add Microsoft later" mistakes.
Engagement cohort segmentation
30-day engaged in weeks 1-2, 60-day in weeks 3-4, 90-day in weeks 5-6, full list thereafter. Cohorts identified from your engagement data — clicks weighted higher than opens because Apple Mail Privacy Protection makes opens unreliable.
Live blacklist monitoring
Continuous checks against Spamhaus SBL/CSS/XBL, Barracuda BRBL, Spamcop, SURBL, URIBL, and the dozen smaller lists that feed corporate filters. Listing during warmup gets remediated immediately — we have institutional relationships with the major operators that go back years.
Weekly written report
Every Monday, a written summary of the previous week's progress, current placement numbers per ISP, what we adjusted, and what to expect in the coming week. The format is short and specific — not an executive summary, an engineer's progress note.
Post-warmup handoff
When the IP completes the curve, we hand off with a documented sustaining plan — the daily volume floor and ceiling that maintains reputation, the alert thresholds we recommend, and the periodic re-engagement work that keeps the engaged-cohort fresh. Reputation is not "established" — it is maintained.
─────────────────────────────────────────────────────────────────────────
06  /  What it costs

€199 per IP, one-time.

Per-IP one-time pricing. Domain warmup runs in parallel where needed at no extra cost on the bundled tier. Multi-IP warmups discount on the third IP and beyond — we are not trying to be clever about it, larger engagements have lower per-IP overhead.

Single IP · One IP, established domain
  • 4-6 week structured curve
  • Daily metrics review by named engineer
  • Per-ISP throttling and cohort segmentation
  • Live blacklist monitoring during ramp
  • Weekly written progress report
  • Post-warmup sustaining plan
  • Pre-warmup list audit (sample)
€199 one-time, per IP
Multi-IP fleet · 3+ IPs, ESPs and platforms
  • 3 or more IPs warmed in coordinated cohort
  • Cross-IP traffic distribution strategy
  • IP pool architecture review and optimisation
  • Per-tenant warmup paths for ESP/platform use
  • €149 per IP from the third IP onward
  • Dedicated engineer through full engagement
  • Quote scales with IP count and target volume
From €547 for 3 IPs · then €149/IP
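
The fleet figure is plain arithmetic: the first two IPs at €199 each, and every IP from the third onward at €149. A small sketch of that list-price calculation follows; `fleet_list_price` is a hypothetical name, and actual multi-IP quotes also scale with target volume, which this does not model.

```python
def fleet_list_price(ip_count: int) -> int:
    """List price in euros: 199 for each of the first two IPs, 149 thereafter."""
    if ip_count < 1:
        raise ValueError("ip_count must be at least 1")
    full_price_ips = min(ip_count, 2)
    discounted_ips = max(ip_count - 2, 0)
    return 199 * full_price_ips + 149 * discounted_ips


print(fleet_list_price(3))  # 199 + 199 + 149
```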
─────────────────────────────────────────────────────────────────────────
07  /  Common questions

IP warmup, specifically.

Questions specific to dedicated-IP warmup engagements, the kind that come up once a customer has decided the volume justifies the IP. For broader topics — pricing structure, on-call coverage, jurisdiction selection — see the main FAQ.

01 My IP has been dormant for 45 days. Do I need to warm up again?
Yes. The major mailbox providers track sender reputation on a rolling 30-day window — if you stop sending for more than that, the reputation effectively expires and the IP is treated as new on the next send. The good news is that re-warmup is faster than original warmup because the domain reputation usually persists for longer than IP reputation does, so the ramp can be more aggressive. We typically run dormant-IP re-warmups in 14 to 21 days rather than 28 to 42.
02 Can I warm up an IP for cold outreach?
We do not warm up IPs that will be used for cold outreach as the primary sending pattern. The reasons are technical, not ideological: cold outreach to lists where the recipient has no prior relationship with the sender produces complaint rates 5-10× higher than opt-in marketing, and that profile is detectable from the first 24 hours of warmup. The warmup will fail because the underlying signal is wrong, not because the schedule is wrong. We will warm IPs for senders running a legitimate opt-in list with cold outreach as a small fraction (under 10%) of overall volume on a separate IP, but a dedicated cold-outreach IP is outside our service scope.
03 What if my warmup fails halfway through?
Define "fails". If the metrics trend the wrong way (complaint rate climbing, deferral patterns increasing, blacklist listing), we pause the schedule, diagnose the root cause, and resume only when the underlying issue is fixed. The ramp gets extended by however many days the pause cost. That is normal — most warmups have at least one pause, and the discipline of pausing early is exactly what separates a successful warmup from a failed one. If the diagnosis reveals a structural problem (your list is dirty, your authentication is broken, your domain has an existing negative reputation), we tell you. The €199 covers our engineering time on the warmup we agreed to run; if the inputs need fixing first, that fix is its own engagement and we quote it separately.
04 Do you handle the domain warmup in parallel?
On the €299 combined tier, yes. Domain reputation is a separate dimension from IP reputation — they are tracked independently by mailbox providers, and warming one without the other gets you partial trust. A new domain warming on a new IP needs both signals progressing together. The combined tier runs both schedules in parallel, with the same engagement-cohort discipline applied to the domain side. If you have an established domain (the case for most existing senders adding a new IP for capacity reasons), the Single IP tier at €199 is the right product — domain reputation transfers to the new IP automatically as it builds.
05 Can I do the warmup myself with your guidance?
Yes — we offer it as a consulting engagement at €750 flat-fee for documentation and a one-hour weekly call. You execute the schedule on your end, send us the daily metrics, we diagnose anything that wobbles. This makes sense if you have an internal deliverability person who has done warmups before and just wants a second opinion on the schedule and a sounding board for the daily decisions. It does not make sense if you are figuring out warmup for the first time — the daily decisions need to be fast, and a one-hour weekly call is the wrong cadence for that. Most clients who try the consulting path during the first warmup switch to the managed path on subsequent IPs.
06 What does success look like? When do you call the warmup complete?
Three conditions, all of which must hold for 7 consecutive days. Inbox placement at 95% or higher across the major receiving domains as measured by seed-list testing — Gmail, Outlook, Yahoo, Apple iCloud Mail, ProtonMail. Complaint rate under 0.1% sustained at the target daily volume. Bounce rate under 0.5% on the warmed list. Hitting these for a single day is meaningless; sustaining them through a full week of target-volume sends is what tells us the reputation is real. We will not declare a warmup complete just because the calendar says we are at day 42 — if the numbers are not there, we extend the ramp until they are. Conversely, if your list is small and the numbers are stable at day 28, we close out the engagement early.
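The completion rule above reduces to a check over the last seven days of results. The sketch below uses the three thresholds from the answer; the `DayResult` record and the `warmup_complete` function are hypothetical, and `min_inbox_placement` is assumed to be the lowest seed-list placement across the major receiving domains for that day.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DayResult:
    min_inbox_placement: float   # lowest placement across major domains (0.95 == 95%)
    complaint_rate: float        # fraction (0.001 == 0.1%)
    bounce_rate: float           # fraction (0.005 == 0.5%)
    at_target_volume: bool       # the day's send reached the target daily volume


def day_passes(d: DayResult) -> bool:
    return (d.at_target_volume
            and d.min_inbox_placement >= 0.95
            and d.complaint_rate < 0.001
            and d.bounce_rate < 0.005)


def warmup_complete(history: List[DayResult], required_days: int = 7) -> bool:
    """True only if the most recent required_days all pass every condition."""
    if len(history) < required_days:
        return False
    return all(day_passes(d) for d in history[-required_days:])
```

A single failing day inside the window resets the count, which is why the check looks at the trailing seven days rather than any seven days.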
─────────────────────────────────────────────────────────────────────────
08  /  Recovery playbook

When the warmup hits a wall.

Four failure modes show up in production warmups, with stable enough patterns that the recovery playbook is the same each time. Below: what each failure looks like, the SMTP signal or behavioural signal that confirms it, the recovery sequence we apply, and the realistic cost in days lost. The fourth case is the one nobody publishes — when recovery is not the right answer.

The playbook below is what we run internally. It is the same checklist for every client, because the patterns are stable enough that improvisation makes things worse, not better. Each failure mode has a different recovery path and a different cost in days lost, and knowing the recovery path before you need it is the difference between a three-day setback and a complete restart.

1 — Early Yahoo soft-block (days 8-14)

The first is the early Yahoo soft-block, usually surfacing between day 8 and day 14. Yahoo is the most aggressive of the major receivers about new IP scrutiny — they evaluate engagement signals harder and faster than Gmail or Microsoft, which means warmup IPs hit Yahoo throttling earlier than other receivers. The signal is 421 4.7.1 deferrals on a meaningful percentage of Yahoo-bound messages, with retry logic eventually delivering most of them but with delivery latency stretched to several hours. The recovery is to cut Yahoo-specific volume by 60 percent for 72 hours while leaving Gmail and Microsoft volume untouched. After 72 hours, resume Yahoo at the volume level from three days before the throttle hit, not the volume that triggered it. Resume the planned ramp from there. Total cost: three days lost, full ramp completion delayed by three days. We see this in roughly one out of every four warmups.
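
The arithmetic of that recovery is easy to get wrong under pressure, so here is a small sketch of it. It assumes you keep a list of daily Yahoo-bound volumes; `yahoo_recovery_plan` is a hypothetical helper, and the numbers in the usage note are invented for illustration.

```python
from typing import Dict, List


def yahoo_recovery_plan(yahoo_daily_volumes: List[int], throttle_index: int) -> Dict[str, int]:
    """Compute the reduced and resume volumes for a Yahoo soft-block.

    yahoo_daily_volumes: Yahoo-bound volume per day, oldest first.
    throttle_index: index of the day the deferrals started.
    """
    trigger_volume = yahoo_daily_volumes[throttle_index]
    # Cut Yahoo volume by 60 percent, i.e. send 40 percent of the trigger volume.
    reduced_volume = trigger_volume * 40 // 100
    # Resume at the level from three days before the throttle, not the trigger level.
    resume_volume = yahoo_daily_volumes[max(0, throttle_index - 3)]
    return {
        "reduced_volume": reduced_volume,
        "reduced_hours": 72,
        "resume_volume": resume_volume,
    }
```

With Yahoo volumes of `[3750, 5250, 7500, 10500, 13500]` and deferrals starting on the last day, the plan is 5,400 per day for 72 hours, then a restart at 5,250. Gmail and Microsoft volumes are not touched by this calculation.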

2 — Gmail Promotions tab drift (days 18-25)

Second is the Gmail Promotions tab drift around days 18 to 25. The signal here is subtle — messages still deliver, but the open rate on Gmail traffic drops by 15 to 25 percent compared to the trajectory of the first two weeks. What is happening is that Gmail's classifier has decided your IP is acceptable for delivery but not for the primary inbox, and it has started routing your traffic to Promotions instead. There is no SMTP signal for this. The only way to detect it is monitoring open rate by receiver, which most senders do not. The recovery is two-pronged: reduce Gmail volume by 30 percent for one week to lower the priority signal weight, and immediately review the message corpus for promotional indicators (excessive imagery, link density, sender-name personalisation gaps). Most Gmail Promotions drift recovers within seven to ten days. We see this in maybe one out of three warmups, more often in B2C marketing senders than B2B transactional.
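
Because there is no SMTP signal, detection comes down to comparing per-receiver open rates against the early baseline. A minimal sketch, assuming you already have daily Gmail-only open rates as fractions: `gmail_open_rate_drift` is a hypothetical name, and the 15 percent relative-drop threshold is the lower bound of the range quoted above.

```python
from statistics import mean
from typing import List


def gmail_open_rate_drift(baseline_rates: List[float], recent_rates: List[float],
                          threshold: float = 0.15) -> bool:
    """Flag a relative drop in Gmail open rate versus the first-two-weeks baseline.

    baseline_rates: daily Gmail open rates from weeks one and two.
    recent_rates: daily Gmail open rates from the most recent days.
    """
    if not baseline_rates or not recent_rates:
        return False
    baseline = mean(baseline_rates)
    if baseline <= 0:
        return False
    relative_drop = (baseline - mean(recent_rates)) / baseline
    return relative_drop >= threshold
```

The same comparison run per receiver also helps separate a Gmail-specific drift from a general engagement decline, since a drop that shows up everywhere points at content or list quality instead.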

3 — Microsoft SmartScreen full block (days 20-35)

Third is the Microsoft full block, the rarest of the three but the most disruptive. Microsoft's SmartScreen filter is binary in a way that Gmail's is not — when it decides an IP is suspect, the result is 550 5.7.0 with delivery rejection across hotmail.com, outlook.com, live.com, and the entire Microsoft consumer mail estate. There is no soft warning. The block usually appears between days 20 and 35 of the warmup, almost always triggered by complaint rate exceeding 0.3 percent in a 24-hour window. Recovery from a Microsoft block takes between five and ten days and requires three things: complete pause on Microsoft volume, complaint pattern analysis to identify the trigger campaign or list segment, and submission through Microsoft SNDS to acknowledge the issue and request reconsideration. The pause is non-negotiable. Senders who try to push through a Microsoft block extend the block.
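
The two SMTP signals named in this playbook can be told apart mechanically in a log pipeline. The classifier below is a sketch: it only recognises the two code pairs mentioned in the text (421 4.7.1 and 550 5.7.0) plus generic 4xx and 5xx classes, `classify_smtp_reply` is a hypothetical name, and real receivers return a wider range of codes and free-text reasons than this covers.

```python
import re

# Three-digit reply code, optional separator, optional enhanced status code.
_REPLY = re.compile(r"^(\d{3})[ -]?(\d\.\d{1,3}\.\d{1,3})?")


def classify_smtp_reply(reply: str) -> str:
    """Map an SMTP reply line to a coarse warmup-relevant category."""
    match = _REPLY.match(reply.strip())
    if not match:
        return "unknown"
    code, enhanced = match.group(1), match.group(2)
    if code == "421" and enhanced == "4.7.1":
        return "deferral"        # throttling pattern described for Yahoo
    if code == "550" and enhanced == "5.7.0":
        return "block"           # rejection pattern described for Microsoft
    if code.startswith("4"):
        return "transient"
    if code.startswith("5"):
        return "permanent"
    return "accepted"
```

Counting `deferral` and `block` results per receiving domain per day gives the percentages the recovery decisions above depend on.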

4 — When recovery is not the answer

Fourth is total reputation collapse, where the IP is failing across two or more major receivers simultaneously. This is rare in our engagements because the warmup discipline catches the precursors, but it does happen — usually because a list quality problem the client did not disclose surfaces in week three or four. When it happens, the only honest answer is to stop the warmup and start over. We have advised clients to abandon a warmup IP after day 19 once. Once. The IP was permanently flagged across Microsoft and Yahoo, recovery would have taken six to eight weeks at best with uncertain outcome, and the client decided correctly to provision a fresh IP and start the 42-day warmup again. Total cost: 19 days plus a fresh ramp. The lesson is that recognising when not to recover is as much a skill as the recovery itself.

─────────────────────────────────────────────────────────────────────────

Skip the warmup, regret it for 60 days.

The cost of skipping warmup is not what most senders think. It is not the first week of poor delivery — it is the 60 days of remediation work it takes to recover from the negative reputation that gets baked in. Warming properly the first time is materially cheaper and faster — and produces better long-term placement than recovering after a botched cold start.