Portfolio allocation for retail P2P lenders: the weekly LLM review that catches concentration drift
A 15-minute weekly LLM review prompt that scores a retail P2P portfolio across five concentration axes and produces a watch-list of the loans most likely to default next quarter.
This piece extends the broader P2P lending with AI playbook to the specific question retail lenders ask once they have more than thirty loans on the books: am I as diversified as the platform’s dashboard says I am, and what is quietly drifting?
The honest answer in almost every portfolio we have looked at is no, and several things are. Individual loans rarely sink retail P2P returns. Concentration drift does — same employer cluster, same geography, same vintage cohort, same gig-platform of origin — and it does so on a timeline of 12 to 18 months, long enough that the cohort responsible for the eventual losses looks fine when you first deploy.
The fix is not a smarter scoring model. It is a structured weekly check that reads your raw CSV the way the platform dashboard does not.
The five axes that actually matter
Most P2P platform dashboards show two cuts: loan count and grade distribution. Both are largely useless for concentration purposes. A portfolio of 80 loans evenly spread across five grades looks diversified on grade and looks well-spread on count. It can still be 60% concentrated in delivery riders working for one of two gig platforms in two adjacent metros — a single shock to those platforms will land on a majority of the book at once.
The five axes worth tracking weekly:
Employer. The single biggest hidden concentration in retail P2P. Borrowers from the same employer tend to come through the platform in clusters — the platform’s marketing or referral mechanics naturally produce them — and a layoff at that employer hits all of them at once. Track exposure per employer name as a percent of deployed capital.
Geography. Pin code, postal code, ZIP, district: whichever your CSV gives you, aggregate it to a meaningful regional level. A natural disaster, a regional employer’s collapse, or a localised regulatory change creates correlated defaults. The right granularity is country-specific: for the US, MSA; for the UK, postcode area (the leading one or two letters of the postcode); for India, district; for Brazil, mesoregion.
Platform / origination channel. If you lend across multiple P2P platforms, this matters as a single axis. If you are on one platform but it sources borrowers from multiple gig partners, those partner names become sub-channels. A platform-level event — fee structure change, new regulation, marketing channel collapse — can affect every loan from that source simultaneously.
Vintage cohort. Loans originated in the same calendar quarter share macro conditions. The Q2 2026 vintage in many markets carried more risk than the Q4 2025 vintage because of shifting underwriting standards on a few major platforms. Track exposure per origination quarter.
Loan grade. The platform’s risk grade is imperfect but informative when read alongside the others. A portfolio that is 70% grade-A but 80% from one employer is more dangerous than its grade distribution suggests.
The platforms that show one or two of these in their dashboards are the better ones. The rest leave it to you.
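As a minimal sketch of what "exposure per bucket as a percent of deployed capital" means, the snippet below aggregates outstanding principal along any one axis. The column names follow the export schema this piece uses for the prompt; the `share_by_bucket` helper and the sample rows are illustrative assumptions, not a platform API or real loan data.

```python
from collections import defaultdict

def share_by_bucket(loans, axis):
    """Return {bucket: share of total outstanding principal}, largest first."""
    totals = defaultdict(float)
    for loan in loans:
        totals[loan[axis]] += loan["principal_outstanding"]
    deployed = sum(totals.values())
    if deployed == 0:
        return {}
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return {bucket: amount / deployed for bucket, amount in ranked}

# Hypothetical sample rows, not real loan data.
loans = [
    {"employer_name": "Acme", "region": "North", "principal_outstanding": 600.0},
    {"employer_name": "Acme", "region": "South", "principal_outstanding": 400.0},
    {"employer_name": "Birch", "region": "North", "principal_outstanding": 1000.0},
]

by_employer = share_by_bucket(loans, "employer_name")
by_region = share_by_bucket(loans, "region")
print(by_employer)
print(by_region)
```

The same function works for all five axes because each is just a different grouping column over the same capital total.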
Why platform diversification scores are insufficient
Platform diversification scores almost always count loans, not exposures. A portfolio with one USD 5,000 loan and twenty USD 250 loans counts as 21 loans on the dashboard. That same portfolio is 50% exposed to one borrower. Any honest concentration metric is dollar-weighted, not loan-weighted, and most platform UIs simply don’t compute it.
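The arithmetic behind that example is short enough to check by hand. The sketch below reproduces it with the same figures, contrasting the share a loan-count view implies with the share of capital actually at risk.

```python
# One USD 5,000 loan plus twenty USD 250 loans, as in the example above.
principals = [5000.0] + [250.0] * 20

loan_count = len(principals)
total = sum(principals)
largest = max(principals)

loan_weighted_share = 1 / loan_count       # what a loan-count view implies
dollar_weighted_share = largest / total    # actual share of capital at risk

print(f"loans on the dashboard: {loan_count}")
print(f"loan-weighted share of the largest loan: {loan_weighted_share:.1%}")
print(f"dollar-weighted share of the largest loan: {dollar_weighted_share:.1%}")
```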
The platforms with the best dashboards still typically miss the cross-axis cuts. They might show employer concentration. They might separately show geography concentration. They almost never show employer-and-geography overlap — twenty borrowers from one employer in one metro, the worst possible joint exposure, presenting cleanly on each individual axis but disastrous when you cross-tabulate.
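To illustrate how a joint exposure can hide behind clean single-axis numbers, here is a small sketch on a made-up portfolio. The employer and region names are hypothetical, and the 8%, 15% and 5% limits are the ones the review prompt in this piece uses.

```python
from collections import defaultdict

def shares(loans, *axes):
    """Share of total principal per bucket; pass two axes for a joint cell."""
    cells = defaultdict(float)
    for loan in loans:
        cells[tuple(loan[a] for a in axes)] += loan["principal_outstanding"]
    deployed = sum(cells.values())
    return {cell: amount / deployed for cell, amount in cells.items()}

# Hypothetical portfolio of USD 10,000: one employer sits entirely in one metro.
loans = (
    [{"employer_name": "Employer X", "region": "Metro Y",
      "principal_outstanding": 700.0}]
    + [{"employer_name": f"Metro Y employer {i}", "region": "Metro Y",
        "principal_outstanding": 100.0} for i in range(7)]
    + [{"employer_name": f"Other employer {i}", "region": f"Region {i % 10}",
        "principal_outstanding": 200.0} for i in range(43)]
)

employer_max = max(shares(loans, "employer_name").values())          # 0.07
region_max = max(shares(loans, "region").values())                   # 0.14
joint_max = max(shares(loans, "employer_name", "region").values())   # 0.07

print(f"largest employer {employer_max:.0%}, largest region {region_max:.0%}, "
      f"largest joint cell {joint_max:.0%}")
```

Both single-axis maxima sit under their limits (7% against 8%, 14% against 15%), while the joint cell at 7% is over the 5% line.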
This is the gap a structured prompt against the raw CSV closes.
The weekly review prompt
The prompt below assumes a CSV export with one row per loan and at minimum these columns: loan_id, principal_outstanding, employer_name, region, origination_quarter, grade, platform_or_channel, status, days_past_due. Add more if you have them.
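Before pasting an export into the prompt, it can help to confirm the columns are actually there. The following is a sketch only: `check_columns` is a hypothetical helper, and `loan_purpose` in the sample stands in for whatever extra field a platform might include.

```python
import csv
import io

REQUIRED = {
    "loan_id", "principal_outstanding", "employer_name", "region",
    "origination_quarter", "grade", "platform_or_channel",
    "status", "days_past_due",
}

def check_columns(csv_text):
    """Return (missing required columns, extra columns) for a CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = set(reader.fieldnames or [])
    return sorted(REQUIRED - present), sorted(present - REQUIRED)

# Hypothetical export: no platform_or_channel, one extra field.
sample = (
    "loan_id,principal_outstanding,employer_name,region,"
    "origination_quarter,grade,status,days_past_due,loan_purpose\n"
    "L1,500,Acme,North,2026-Q1,B,current,0,vehicle\n"
)

missing, extra = check_columns(sample)
print("missing:", missing)
print("extra:", extra)
```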
You are a retail P2P portfolio analyst. Analyse the CSV below.
Step 1 — Compute total deployed capital (sum of principal_outstanding for
loans where status is "active" or "current" or "past_due_under_30").
Step 2 — For each of these axes, produce a concentration table showing the
top 10 buckets by share of total deployed capital:
   a. employer_name
   b. region
   c. platform_or_channel
   d. origination_quarter (vintage)
   e. grade
Step 3 — For each axis, flag any bucket exceeding the threshold:
   - employer: 8% of deployed capital
   - region: 15%
   - platform/channel: 40%
   - vintage: 25%
   - grade: 50%
Step 4 — Compute joint concentrations: for the top 3 employers and top 3
regions, show the cross-tab. Flag any cell above 5%.
Step 5 — Produce a watch-list of the 3 loans most likely to default in the
next 90 days, using these signals in order of weight:
   - days_past_due > 0
   - employer_name in a flagged employer bucket
   - region in a flagged region bucket
   - vintage in a flagged vintage bucket
   - grade C or below
For each loan, write one sentence on why it's on the list.
Step 6 — End with two short sections: "What's drifted since last week"
(if a previous report is appended below) and "What I would not deploy
into next week."
Output as Markdown with tables. USD figures, no currency conversion.
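Steps 1 and 3 are plain arithmetic, so one way to sanity-check the model's tables is to recompute them directly. The sketch below does that under stated assumptions: "exceeding the threshold" is read as strictly greater than, the `row` helper and sample rows are hypothetical, and `defaulted` is an assumed status value that falls outside the Step 1 filter.

```python
from collections import defaultdict

COUNTED = {"active", "current", "past_due_under_30"}   # Step 1 statuses
LIMITS = {                                              # Step 3 thresholds
    "employer_name": 0.08,
    "region": 0.15,
    "platform_or_channel": 0.40,
    "origination_quarter": 0.25,
    "grade": 0.50,
}

def deployed_and_flags(loans):
    """Return (deployed capital, [(axis, bucket, share)] above its limit)."""
    live = [loan for loan in loans if loan["status"] in COUNTED]
    deployed = sum(loan["principal_outstanding"] for loan in live)
    flags = []
    for axis, limit in LIMITS.items():
        totals = defaultdict(float)
        for loan in live:
            totals[loan[axis]] += loan["principal_outstanding"]
        for bucket, amount in totals.items():
            if deployed and amount / deployed > limit:
                flags.append((axis, bucket, amount / deployed))
    return deployed, flags

def row(principal, employer, region, quarter, grade, status):
    """Build one hypothetical loan row on a single platform."""
    return {"principal_outstanding": principal, "employer_name": employer,
            "region": region, "platform_or_channel": "Platform A",
            "origination_quarter": quarter, "grade": grade, "status": status}

loans = [
    row(500.0, "Employer A", "Region 1", "2026-Q1", "A", "current"),
    row(300.0, "Employer B", "Region 2", "2025-Q4", "B", "active"),
    row(200.0, "Employer C", "Region 3", "2025-Q3", "B", "past_due_under_30"),
    row(1000.0, "Employer A", "Region 1", "2026-Q1", "C", "defaulted"),
]

deployed, flags = deployed_and_flags(loans)
for axis, bucket, share in flags:
    print(f"{axis:22} {bucket:12} {share:.1%}")
```

With only three live loans nearly every bucket is flagged, which is expected at this size. If the model's totals disagree with a recomputation like this, trust the recomputation.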
A worked example — the synthetic portfolio
We built a synthetic 120-loan retail portfolio and seeded it with two hidden concentrations: a 14-borrower employer cluster (“RouteCart Logistics”, a fictional gig-delivery operator), and a vintage tilt where 38% of deployed capital is in a single Q1 2026 cohort. On the platform dashboard, this portfolio looks balanced — 4 grades represented, 11 regions, no obvious red flag.
The prompt’s output:
| Axis | Top bucket | % of capital | Flag? |
|---|---|---|---|
| Employer | RouteCart Logistics | 11.2% | Yes (>8%) |
| Region | Greater Manila | 17.4% | Yes (>15%) |
| Platform | Platform A | 100% | Yes (>40%) — single platform |
| Vintage | 2026-Q1 | 38.1% | Yes (>25%) |
| Grade | Grade B | 47.6% | No |
The cross-tab between top employers and top regions surfaces the worst joint exposure: 7.8% of capital is in RouteCart riders in Greater Manila, in a portfolio whose overall RouteCart exposure is 11.2%. That joint cell exceeds the 5% flag — a regional shock to the gig-delivery sector in one metro lands on a meaningful share of the book.
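A quick calculation on those two figures shows how lopsided the cluster is. This sketch uses only the percentages quoted above and the 5% joint flag from Step 4.

```python
routecart_total = 11.2       # % of capital, employer axis
routecart_in_manila = 7.8    # % of capital, joint cell
joint_flag = 5.0             # % threshold from Step 4

share_of_employer_in_region = routecart_in_manila / routecart_total
breach = routecart_in_manila > joint_flag

print(f"{share_of_employer_in_region:.0%} of RouteCart exposure sits in Greater Manila")
print(f"joint cell above the flag: {breach}")
```

Roughly seven-tenths of the employer exposure is in a single metro, which is why the joint cell matters more than either axis alone.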
The watch-list: three loans identified by status (one already 14 days past due), employer cluster (two RouteCart loans in flagged combinations), and vintage. Each gets a one-sentence rationale. The “what I would not deploy into next week” section names RouteCart and Q1 2026 originations explicitly. It does not refuse the platform overall, which is a separate decision, but it sets a deployment ceiling.
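For intuition on how the Step 5 signals could combine, here is one possible scoring sketch. The prompt fixes only the order of the signals, so the numeric weights, the `watch_score` and `watch_list` helpers, and the sample rows are all assumptions; the grade test also assumes single-letter grades with A as the best.

```python
# Hypothetical weights: the prompt fixes only the ORDER of the signals.
WEIGHTS = {"past_due": 5, "employer": 4, "region": 3, "vintage": 2, "grade": 1}

def watch_score(loan, flagged):
    """Sum the weights of every signal a loan trips."""
    score = 0
    if loan["days_past_due"] > 0:
        score += WEIGHTS["past_due"]
    if loan["employer_name"] in flagged["employer"]:
        score += WEIGHTS["employer"]
    if loan["region"] in flagged["region"]:
        score += WEIGHTS["region"]
    if loan["origination_quarter"] in flagged["vintage"]:
        score += WEIGHTS["vintage"]
    if loan["grade"] >= "C":  # assumes single-letter grades, A is best
        score += WEIGHTS["grade"]
    return score

def watch_list(loans, flagged, n=3):
    """Return the n highest-scoring loan ids, highest first."""
    ranked = sorted(loans, key=lambda loan: watch_score(loan, flagged), reverse=True)
    return [loan["loan_id"] for loan in ranked[:n]]

def row(loan_id, dpd, employer, region, quarter, grade):
    """Build one hypothetical loan row."""
    return {"loan_id": loan_id, "days_past_due": dpd, "employer_name": employer,
            "region": region, "origination_quarter": quarter, "grade": grade}

flagged = {"employer": {"RouteCart Logistics"},
           "region": {"Greater Manila"},
           "vintage": {"2026-Q1"}}

loans = [
    row("L1", 14, "RouteCart Logistics", "Greater Manila", "2026-Q1", "B"),
    row("L2", 0, "RouteCart Logistics", "Greater Manila", "2026-Q1", "C"),
    row("L3", 0, "Other employer", "Other region", "2025-Q4", "A"),
    row("L4", 0, "RouteCart Logistics", "Other region", "2026-Q1", "B"),
    row("L5", 0, "Other employer", "Greater Manila", "2025-Q4", "D"),
]

print(watch_list(loans, flagged))
```

Changing the weights changes the ranking, which makes a sketch like this a convenient place to test whether the signal order matches your own judgement.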
Total time from CSV export to actionable output: 11 minutes on a recent Claude or GPT model. The bulk of the time is the human reading the output, not the model generating it.
Reading the output
Three concrete actions a retail lender takes from this report each week:
A flag on a single axis is a heads-up. A flag on the joint cell of two axes is a stop-deploying-into-this signal. The Manila × RouteCart cell breaching 5% means the next loan to a RouteCart rider in Manila should be skipped, not because the borrower is uncreditworthy, but because the marginal exposure is wrong.
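The "marginal exposure" argument can be put in numbers. The sketch below assumes a USD 100,000 portfolio and a USD 500 next loan, both hypothetical since the example gives only percentages, and estimates how much fresh capital would have to go elsewhere before the cell drops back to 5%.

```python
def fresh_capital_needed(cell_amount, deployed, limit):
    """Capital to deploy OUTSIDE the cell so its share falls back to limit."""
    return max(0.0, cell_amount / limit - deployed)

deployed = 100_000.0         # hypothetical portfolio size
cell = 0.078 * deployed      # Manila x RouteCart cell at 7.8%
new_loan = 500.0             # hypothetical next loan into the same cell

share_now = cell / deployed
share_if_funded = (cell + new_loan) / (deployed + new_loan)
dilution = fresh_capital_needed(cell, deployed, 0.05)

print(f"cell share now: {share_now:.2%}")
print(f"cell share if the loan is funded: {share_if_funded:.2%}")
print(f"capital needed elsewhere to reach 5%: USD {dilution:,.0f}")
```

Under these assumptions the cell would need about USD 56,000 deployed outside it before another RouteCart loan in Manila fits inside the limit.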
A loan on the watch-list does not need to be sold or refinanced — usually it cannot be on a retail P2P platform anyway. It needs to be excluded from any reinvestment of repayments. The cure is to let the watch-listed loans run their course while redirecting fresh capital elsewhere.
Drift since last week is the most actionable section, and it requires you to keep last week’s report. Append it under the new CSV in the prompt and the model can compare. A bucket that grew from 9% to 11% is the kind of thing the platform never tells you and that compounds.
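If you keep each week's bucket shares, the comparison can also be done mechanically as a cross-check. This is a minimal sketch: the `drift` helper, the one-percentage-point reporting floor, and every bucket name other than RouteCart are assumptions.

```python
def drift(previous, current, min_change=0.01):
    """Buckets whose share moved by at least min_change, biggest move first."""
    changes = {}
    for bucket in set(previous) | set(current):
        delta = current.get(bucket, 0.0) - previous.get(bucket, 0.0)
        if abs(delta) >= min_change:
            changes[bucket] = delta
    return sorted(changes.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical employer shares from two consecutive weekly reports.
last_week = {"RouteCart Logistics": 0.09, "Employer B": 0.05, "Employer C": 0.040}
this_week = {"RouteCart Logistics": 0.11, "Employer B": 0.05, "Employer C": 0.035}

for bucket, delta in drift(last_week, this_week):
    print(f"{bucket}: {delta:+.1%}")
```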
What the LLM gets wrong
Three failure modes worth naming:
Recency bias on defaults. If your CSV includes a default flag, the model overweights very recent defaults when picking watch-list candidates. A loan that defaulted yesterday gets the model’s attention; a structurally identical loan in the same employer cluster that has not yet defaulted gets less. Tune by either stripping the default flag from the CSV or instructing the prompt explicitly to weight forward-looking signal over backward-looking signal.
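Stripping a column before the CSV reaches the model takes a few lines. In this sketch the column name `defaulted` is an assumption; substitute whatever your export calls its default flag.

```python
import csv
import io

def drop_columns(csv_text, columns):
    """Return the CSV text with the named columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in (reader.fieldnames or []) if name not in columns]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for record in reader:
        writer.writerow(record)
    return out.getvalue()

# Hypothetical export with a backward-looking default flag.
raw = "loan_id,principal_outstanding,defaulted\nL1,500,0\nL2,250,1\n"
cleaned = drop_columns(raw, {"defaulted"})
print(cleaned)
```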
Prepayment blindness. The model has no view on which loans are likely to prepay early. Prepayments collapse expected interest and reshape concentration arithmetically: a high-paying borrower who prepays does not improve your exposure; the prepayment removes capital from a useful diversifier and raises the share of every remaining bucket until you redeploy. The prompt cannot see this. Track prepayment rates separately.
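A small worked example of that arithmetic, with all figures hypothetical: an employer cluster sits exactly at 8% of a USD 10,000 book, then an unrelated USD 2,000 loan prepays and the cash has not yet been redeployed.

```python
deployed = 10_000.0   # hypothetical deployed capital before the prepayment
cluster = 800.0       # hypothetical employer cluster, exactly 8% of the book
prepaid = 2_000.0     # unrelated loan that repays early

share_before = cluster / deployed
share_after = cluster / (deployed - prepaid)

print(f"cluster share before prepayment: {share_before:.1%}")
print(f"cluster share after prepayment:  {share_after:.1%}")
```

Nothing about the cluster changed, yet its share moves from the 8% line to 10% because the denominator shrank.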
Off-platform obligations. The model sees only what is in the CSV. A borrower with three loans on three different P2P platforms looks like one loan on yours. Where regulation permits it (most regimes do), check whether your platform shares cross-platform borrower data and ask for that field on the export. If it isn’t available, accept that this gap exists.
A simple sanity check we run alongside the prompt: spot-check the top three loans in the largest employer bucket. Read each borrower’s profile manually. If the model’s watch-list does not include the one that worries you most, the fault lies with your prompt’s signal weights, not with the borrower.
A regional reality check
The prompt is platform-agnostic, but the data fields and the regulatory frame around P2P differ by region.
In the United Kingdom, FCA-regulated peer-to-peer platforms are governed by PS19/14 and follow-on rules. Investor-protection requirements mean platform exports are typically richer than elsewhere. In the European Union, the European Crowdfunding Service Providers Regulation (ECSPR) standardises disclosure across member states; expect comparable export quality. In the United States, marketplace lending falls under the CFPB’s market-monitoring lens and varies by state for retail-investor access. In India, RBI’s Master Directions on NBFC-P2P platforms regulate the platforms themselves; an individual lender on those platforms is unregulated as a lender, though tax treatment varies. In Australia, ASIC frames P2P under the credit-licensing regime. In Brazil, the CVM’s Resolução 88 governs crowdfunding. In Kenya and across parts of Africa, mobile-money-integrated P2P operates under nascent frameworks; export quality is typically the lowest.
The prompt itself does not change. The fields available do. Build the prompt around the lowest-common-denominator schema (loan_id, principal_outstanding, employer_name, region, origination_quarter, grade, status, days_past_due) and let it use additional fields if present.
Where to go from here
If you want a deeper concentration framework — including the dollar-weighted exposure templates and the cross-axis stress tests retail platforms don’t run for you — the AI Credit Scorecard Template ships with a portfolio-allocation tab built on the same five-axis logic.
Next read: the broader P2P-with-AI playbook — how this weekly review fits inside the rest of the retail-lender stack.
Frequently asked questions
How concentrated is too concentrated for a retail P2P portfolio?
A working rule across regions: no single employer above 8% of deployed capital, no single geography above 15%, no single platform above 40%, no single vintage cohort above 25%, no single loan-grade above 50%. Breach two of these simultaneously and you have a real exposure problem, not a theoretical one. Retail lenders with portfolios under 30 active loans almost always breach at least one — the rules are guardrails, not commandments, and the point is to know when you are inside or outside them.
Can an LLM actually see concentration risk a platform's own dashboard misses?
Yes, because most platform dashboards count loans, not exposures, and they almost never group across the axes that matter — employer, geography, gig-platform of origin, vintage cohort. An LLM working on the raw CSV cuts the data however the prompt asks, including cuts the platform never built a UI for. The LLM does not have proprietary data the platform lacks. It has the same data, presented differently, with the cuts that actually matter.
How big does my P2P portfolio need to be before this weekly review is worth the time?
Around 25 active loans is where the review starts paying for itself. Below that, concentration is mathematically forced — every loan is a meaningful share — and the cure is more loans, not more analysis. Between 25 and 100 loans, the weekly LLM review costs 15 minutes and surfaces drift you would otherwise miss. Above 100 loans, you almost certainly need a small dashboard layer in addition, because the watch-list mechanic gets unwieldy.
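"Mathematically forced" can be made concrete. Assuming equal-sized loans, each in its own bucket (a best case, not a typical portfolio), the sketch below computes the smallest number of buckets at which an equal share no longer exceeds each limit from the working rule above. The `min_equal_buckets` helper is hypothetical.

```python
# Limits from the working rule, in percent of deployed capital.
LIMIT_PCT = {"employer": 8, "region": 15, "platform": 40, "vintage": 25, "grade": 50}

def min_equal_buckets(limit_pct):
    """Smallest count of equal-sized buckets whose share is at or under the limit."""
    return -(-100 // limit_pct)   # integer ceiling of 100 / limit_pct

floors = {axis: min_equal_buckets(pct) for axis, pct in LIMIT_PCT.items()}
print(floors)
```

Even in this best case, a lender needs 13 equal loans to 13 different employers to stay inside the 8% employer limit. At 25 equal loans each one is 4% of the book, so a third loan to the same employer is already a breach.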
Sources
- Principles for the management of credit risk · Basel Committee on Banking Supervision
- Loan-based crowdfunding platforms: Feedback to CP18/20 and final rules (PS19/14) · Financial Conduct Authority
- Marketplace Lending — research reports · Consumer Financial Protection Bureau
- Master Directions — Non-Banking Financial Company (Peer-to-Peer Lending Platform) · Reserve Bank of India