Our Testing Methodology

The long version of how we test, score, weigh, and update reviews — including how we balance our own findings against AliExpress buyer reviews, how we handle conflicts of interest, and when we walk away from a review entirely.

100-point rubric · 5 weighted categories · Cross-check against 50–100 buyer reviews · 30-day update window · ~33% “skip it” verdict rate · One human tester

This page exists because the shorter How We Review page glosses over the trade-offs. The methodology below is what we actually do, including the parts that are messy or imperfect.

1. Why we trust this process

The rubric started as a list of every way we got burned in our first six months of buying on AliExpress. We were the customer first. We lost about $200 in that window. The five scoring categories each map to a specific failure mode:

  • Build quality (25 pts): for the $14 lamp that fell apart in a week.
  • Performance (30 pts): for the “10,000mAh” power bank that delivered 3,800mAh on the meter.
  • Value for price (20 pts): for the items that were fine but were also $20 in the same listing one month earlier.
  • Match to description (15 pts): for the “stainless steel” knife that was painted aluminum.
  • Safety and packaging (10 pts): for the USB charger that smoked the second time we plugged it in.

Performance gets the heaviest weight because that’s where the biggest dollar mistakes happen. Safety gets the smallest numerical weight only because it’s a kill switch. A fail on safety drops the whole product to a “do not buy” verdict regardless of the other scores. We’ve used the kill switch six times so far.
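
To make the arithmetic concrete, here is a minimal sketch of how the five weights and the kill switch fit together. It is an illustration, not a tool we publish: the names `WEIGHTS` and `score_product`, the `safety_failed` flag, and the example numbers are all assumptions made for this example.

```python
# Maximum points per category, copied from the rubric above.
WEIGHTS = {
    "build_quality": 25,
    "performance": 30,
    "value_for_price": 20,
    "match_to_description": 15,
    "safety_and_packaging": 10,
}


def score_product(points, safety_failed):
    """Return the 100-point total plus the kill-switch flag.

    points: dict mapping each category name to the points awarded.
    safety_failed: True when the product fails on safety (hypothetical flag).
    """
    if set(points) != set(WEIGHTS):
        raise ValueError("points must cover exactly the five rubric categories")
    for category, awarded in points.items():
        if not 0 <= awarded <= WEIGHTS[category]:
            raise ValueError(f"{category}: expected 0 to {WEIGHTS[category]}")
    return {
        "total": sum(points.values()),
        # A safety fail overrides everything else, whatever the total is.
        "do_not_buy": safety_failed,
    }


# Hypothetical product: strong everywhere except safety.
example = score_product(
    {
        "build_quality": 22,
        "performance": 27,
        "value_for_price": 18,
        "match_to_description": 14,
        "safety_and_packaging": 2,
    },
    safety_failed=True,
)
print(example)  # total is 83, but do_not_buy is True
```

The kill switch is kept as a separate flag rather than folded into the total, so a high score can never hide a safety fail.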

2. How we pick which products to test

We don’t run a request queue, and we don’t pull products at random. We pick products that meet all four of these conditions:

  1. Demand. The product, or close variants, has at least 1,000 monthly orders on AliExpress.
  2. Review density. At least 200 buyer reviews exist across the top three listings, so we can read a real pattern.
  3. Listing confusion. The top 10 listings for the search term include at least three near-identical products at significantly different prices. If a buyer can pick blindly and be fine, we don’t need to be there.
  4. We can actually test it. We won’t review industrial equipment, prescription items, or anything that requires specialized lab gear. We test consumer goods we can use in a normal household.

About one-third of products that pass the demand filter fail one of the others, usually #3. That’s fine. We’d rather publish 4 useful reviews a month than 20 generic ones.
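
The four conditions can be sketched as a simple checklist. This is a hedged illustration: the `Candidate` fields and the `failed_conditions` function are hypothetical names, and the “listing confusion” count still comes from a person comparing listings, since “significantly different prices” is a judgment call rather than a fixed number.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    monthly_orders: int         # product or close variants
    reviews_top3: int           # buyer reviews across the top three listings
    confusing_listings: int     # near-identical, differently priced items in the top 10
    testable_at_home: bool      # no lab gear, industrial, or prescription items


def failed_conditions(c):
    """Return the names of the selection conditions this candidate fails."""
    failed = []
    if c.monthly_orders < 1000:
        failed.append("demand")
    if c.reviews_top3 < 200:
        failed.append("review density")
    if c.confusing_listings < 3:
        failed.append("listing confusion")
    if not c.testable_at_home:
        failed.append("testable")
    return failed


# Hypothetical candidate: popular and well reviewed, but easy to buy blindly.
print(failed_conditions(Candidate(1500, 250, 1, True)))
```

An empty list means the product goes into the testing queue; anything else means we pass on it.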

3. The hands-on testing protocol

Once a product arrives, we go through the same protocol:

  • Unboxing on camera. We film the box opening, packaging condition, included parts, and any obvious damage. The video stays. We never re-shoot a “cleaner” unboxing later.
  • Spec verification. We measure what the listing claims. Weight on a kitchen scale. Dimensions with a ruler. Battery capacity with a USB meter. Brightness with a lux meter for lights. Noise with a phone-based decibel meter (calibrated against a known source). We note tolerances honestly: “listing says 1,200mAh, meter shows 950mAh ±10%.”
  • Real use. Small items: 1 hour of normal use. Daily-use items (massagers, lamps, kitchen tools): minimum 7 days. Long-life items (cables, tools): minimum 30 days before we publish, then re-evaluation at 90 days.
  • Stress test. We try the product slightly outside spec. Charge to 100% and leave it. Run a fan on max for 4 hours. Twist a cable repeatedly. We’re not trying to break things for sport. We’re testing how the product handles the way real buyers actually use it.
  • Score with notes. Each criterion gets a number and a one-sentence justification. The justification is more important than the number; it’s what readers cite back to us when they disagree.
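
As a worked example of the spec-verification step, the sketch below turns a listing claim and a meter reading into the kind of note quoted above. The function name `spec_check` and the way the tolerance is applied (as a simple plus-or-minus band around the reading) are assumptions for illustration.

```python
def spec_check(claimed, reading, tolerance):
    """Compare a listing claim with a measured value.

    tolerance is the meter's stated uncertainty as a fraction (0.10 for ±10%).
    """
    low = reading * (1 - tolerance)
    high = reading * (1 + tolerance)
    return {
        "low": low,
        "high": high,
        "percent_of_claim": round(100 * reading / claimed, 1),
        # True only if the claim falls inside the measurement band.
        "claim_plausible": low <= claimed <= high,
    }


# The example from the text: listing says 1,200mAh, meter shows 950mAh ±10%.
result = spec_check(claimed=1200, reading=950, tolerance=0.10)
print(result["percent_of_claim"], result["claim_plausible"])
```

Here the reading is about 79% of the claim, and even the top of the ±10% band (about 1,045mAh) falls short of 1,200mAh, so the gap cannot be explained by meter error alone.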

4. Why we trust AliExpress reviews (and where we don’t)

AliExpress buyer reviews are not garbage. They are noisy. Most are short (“good,” “fast shipping”). A meaningful minority are detailed, photographed, and specific. Our cross-check process treats them as a noisy data source we read carefully.

What we trust them for: patterns. If 8 out of 100 recent reviews mention the same defect (“button stops working after a week”), that’s a real signal even if any one review is unreliable. We trust them for shipping experience and seller responsiveness, because those are things a buyer experiences honestly. We trust photographed reviews more than text-only.
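
The pattern check amounts to counting how many recent reviews mention the same problem. The sketch below is a rough stand-in for what is really a manual read: plain phrase matching is an assumption, and it will miss reworded or translated complaints. The function name `defect_mention_rate` and the sample reviews are made up for the example.

```python
def defect_mention_rate(reviews, phrases):
    """Share of reviews that mention any of the given phrases (case-insensitive)."""
    if not reviews:
        return 0.0
    lowered = [p.lower() for p in phrases]
    hits = sum(1 for text in reviews if any(p in text.lower() for p in lowered))
    return hits / len(reviews)


# Hypothetical sample: 100 recent reviews, 8 of which describe the same defect.
sample = ["good"] * 60 + ["fast shipping"] * 32 + ["Button stops working after a week"] * 8
rate = defect_mention_rate(sample, ["button stops working"])
print(f"{rate:.0%} of reviews mention the defect")
```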

What we don’t trust them for: aggregate star ratings, especially on listings under 6 months old. Sellers boost early ratings with promotions and incentivized reviews, and the platform’s review-purchase verification has known gaps. We look at the rating distribution and the 1- and 2-star reviews much more than the headline 4.7 number.

When our test contradicts the buyer-review pattern, we say so explicitly in the review and try to explain why. “Most reviewers say battery is fine. Our unit died at 60% of rated capacity. Possible batch defect. Updating in 30 days.” Honesty beats false certainty.

5. Conflict-of-interest policy

We earn affiliate commission when readers buy through outbound links. The full mechanics are on the affiliate disclosure page. The methodology-relevant parts:

  • We do not accept free or discounted products in exchange for coverage. Every reviewed item is bought at retail with our money.
  • We do not accept payment from sellers, agencies, or platforms for inclusion in any list, ranking, or recommendation.
  • If a seller contacts us during a review cycle, we either decline contact or note it in the review.
  • The commission rate is the same across products within a category, so there is no incentive to recommend one product over another for financial reasons. There is, however, an incentive to recommend something rather than say “skip the whole category.” We push back on that incentive by tracking how often our verdict is “skip it.” It currently runs around 33%, and we’d be uncomfortable if it ever dropped below 20%.
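
Tracking the “skip it” rate is simple arithmetic, sketched here under a few assumptions: the verdict labels other than “skip it” are placeholders, and `skip_rate` and `below_floor` are hypothetical names. The 20% floor is the comfort threshold mentioned above.

```python
SKIP_FLOOR = 0.20  # we'd be uncomfortable below this share of "skip it" verdicts


def skip_rate(verdicts):
    """Fraction of published verdicts that are "skip it"."""
    if not verdicts:
        raise ValueError("need at least one verdict")
    return sum(1 for v in verdicts if v == "skip it") / len(verdicts)


def below_floor(verdicts):
    """True when the skip rate has dropped under the comfort threshold."""
    return skip_rate(verdicts) < SKIP_FLOOR


# Hypothetical history: 1 in 3 verdicts is "skip it", roughly the current rate.
history = ["skip it", "buy", "buy"] * 10
print(round(skip_rate(history), 2), below_floor(history))
```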

6. When we decline to publish

Sometimes we test a product and don’t publish. Reasons we’ve actually used:

  • Our test unit arrived broken and we couldn’t get a replacement in time. We won’t publish a verdict based on a single broken unit.
  • The product category turned out to be too volatile (listings change weekly), and any review would be out of date within a month. (Phone cases are our worst offender here.)
  • The product is functionally identical to a higher-quality item from a known brand sold at a similar price elsewhere. In that case, the honest review is “buy the brand-name version,” even though we don’t have an affiliate link for the better choice. We say it anyway in a short post and move on.
  • The product has safety concerns we can’t fully verify (mostly electricals without proper certification). We don’t want to be the source someone cites when they plug in something that catches fire.

7. When and how we update

Every published review has a “Last reviewed” date and a separate “Last verified” date. We re-verify reviews on a rolling schedule:

  • Price-trigger update: if the current price differs from our tested price by more than 20%, the review goes back into the queue and gets revised within 30 days.
  • Seller-change update: if the seller we tested is no longer the top result, we re-evaluate within 30 days and sometimes re-test if the new seller is different enough.
  • Reader-flagged update: if we get two or more reader reports of an issue we didn’t see, we re-test or note it openly within 14 days.
  • Annual sweep: every review older than 12 months gets a full re-verification regardless of triggers, and a “still valid” or “withdrawn” stamp added.
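
The first three triggers reduce to a threshold check plus a deadline. The sketch below is illustrative only: the function names are hypothetical, and measuring the 20% price change relative to the tested price is an assumption, since the rule above does not say which price is the baseline. The annual sweep is left out because it runs regardless of triggers.

```python
from datetime import date, timedelta

# Days allowed to revise, re-evaluate, or re-test after each trigger fires.
WINDOW_DAYS = {"price": 30, "seller_change": 30, "reader_flag": 14}


def price_triggered(tested_price, current_price):
    """True when the price moved more than 20% relative to the tested price."""
    return abs(current_price - tested_price) / tested_price > 0.20


def reader_flag_triggered(report_count):
    """True once two or more readers report an issue we didn't see."""
    return report_count >= 2


def update_deadline(trigger, noticed_on):
    """Date by which the review should be updated for the given trigger."""
    return noticed_on + timedelta(days=WINDOW_DAYS[trigger])


# Hypothetical case: tested at $10.00, now listed at $12.50.
if price_triggered(10.00, 12.50):
    print(update_deadline("price", date(2024, 3, 1)))
```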

This is a small operation, so we miss the targets sometimes. When we do, we say so on the review page rather than backdating the verification stamp. To meet the people who actually do the work, see the team page. For the simple summary of all this, the How We Review page is shorter.