Vintage Curve Update — 06/12/2008

Brief Explanation: These curves show the entire set of Prosper loans, broken down by credit grade and lined up along the x axis so that every loan starts at its origination date (the x axis is days since origination)…  When a loan goes late (1 month or worse) it is counted (either by amount or by count) as late against the population…  The curves stop when the loan population falls below 250 (i.e. there are 249 or fewer loans of that age or older)…

Recently a study from the University of Maryland claimed that defaults on Prosper loans peak around month 10.  If that is right, the largest one-month increase in these curves should fall around month 10.  Do these graphs confirm or deny that statement?  Are they conclusive?  Please leave a comment.
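
One way to check a claim like this against the numbers behind a curve is to look for the largest one-month jump. Here is a minimal sketch in Python, assuming the curve has been sampled once per month. The function name and the sample values are made up for illustration and are not Prosper data.

```python
def steepest_month(late_pct_by_month):
    """Return (month, increase) for the largest month-over-month rise.

    late_pct_by_month[i] is the late percentage at the end of month i,
    with index 0 being origination.
    """
    deltas = [
        (month, later - earlier)
        for month, (earlier, later) in enumerate(
            zip(late_pct_by_month, late_pct_by_month[1:]), start=1
        )
    ]
    # Pick the month whose increase over the previous month is largest.
    return max(deltas, key=lambda pair: pair[1])


# Illustrative values only (not real Prosper data).
sample_curve = [0.0, 0.5, 1.5, 3.0, 4.0]
print(steepest_month(sample_curve))
```

A single steepest month does not settle the question on its own, since each point on a vintage curve mixes loans from many origination dates.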

Here are the vintage curves by count (click graph for larger version)…

Vintage Curves By Count

Here are the vintage curves by amount (larger loans go late at a higher rate, so on a percentage basis you would expect these curves to run higher than the by-count curves) (click graph for larger version)…

Vintage Curves By Amount

Here is the SQL that I used to pull the underlying data out of the public and private data downloads:

-- A loan counts as late once it is more than @DTD days past due.
DECLARE @DTD int
SET @DTD=30
SELECT
-- PIT: age of the loan, in days since origination, on the day aday
cast(aday-originationdate as int) as 'PIT',
l.creditgrade,
sum(PrincipalBalance+NetDefaults) as 'Amount',
count(l.[key]) as 'Count',
-- Late test used in every CASE below: the loan was already past due at
-- its latest observation, and its DPD projected forward to aday
-- (DPD plus the days elapsed since that observation) exceeds @DTD.
sum(case WHEN (mld.DPD!=0 and
       (mld.DPD+(aday-observationdate))>@DTD) THEN
            PrincipalBalance+NetDefaults ELSE 0 END) as 'AmountLate',
sum(case WHEN (mld.DPD!=0 and
       (mld.DPD+(aday-observationdate))>@DTD) THEN
           PrincipalBalance+NetDefaults ELSE 0 END)/
           sum(PrincipalBalance+NetDefaults) as 'AmountLatePercentage',
sum(case WHEN (mld.DPD!=0 and
     (mld.DPD+(aday-observationdate))>@DTD) THEN
        1 ELSE 0 END) as 'CountLate',
-- 1.0/0.0 force decimal division so the percentage is not truncated
sum(case WHEN (mld.DPD!=0 and
       (mld.DPD+(aday-observationdate))>@DTD) THEN
       1.0 ELSE 0.0 END)/count(l.[key]) as 'CountLatePercentage'
FROM
loan l
inner join creditprofile cp on cp.listingkey=l.listingkey
inner join LoanPerformance mld on l.[key]=mld.loankey
-- alldays supplies one row per calendar day (column aday)
cross join alldays
where
-- for each loan and day, use only the latest observation before that day
mld.observationdate = ( select top 1 observationdate
from LoanPerformance sub
where sub.observationdate < aday
and sub.loankey=mld.loankey order by sub.observationdate DESC )
and aday < getDate()
and aday >= '02/01/2006'
and l.creditgrade!='NC'
group by
cast(aday-originationdate as int),
l.creditgrade
having
-- drop points where the population is too small or has no balance
count(l.[key])>250 and
sum(PrincipalBalance+NetDefaults)>0
order by
PIT
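
The late test that repeats in each CASE expression of the query can be easier to follow outside of SQL. The following is a rough Python sketch of that one condition, assuming dates are handled in whole days; the function and parameter names are illustrative and do not come from the Prosper data export.

```python
from datetime import date


def is_late(dpd, observation_date, as_of, threshold_days=30):
    """Mirror of the query's CASE condition for a single loan and day.

    dpd              -- days past due recorded at the latest observation
    observation_date -- date of that observation
    as_of            -- the day being evaluated (aday in the query)
    threshold_days   -- the @DTD threshold, 30 in the query
    """
    if dpd == 0:
        # A loan that was current at its last observation is never
        # counted as late, however old that observation is.
        return False
    # Project days past due forward from the observation to as_of.
    projected_dpd = dpd + (as_of - observation_date).days
    return projected_dpd > threshold_days


print(is_late(10, date(2008, 5, 1), date(2008, 5, 15)))  # 10 + 14 days
print(is_late(10, date(2008, 5, 1), date(2008, 6, 1)))   # 10 + 31 days
```

Projecting forward this way assumes a loan that was past due at its last observation has stayed past due since then.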


Categories:

Prosper.com, Statistics, Strategy, Tools





3 comments ↓
#1 BB on 06.13.08 at 9:58 am

Well, your curves do seem to be steepest around 300 days… Whether or not this is because of a high default rate around 10 months can’t be seen from this. Older loans have a lower default rate, as could be expected, either i) because they were originated more than 10 months ago or ii) because they’re older and all the ‘bad’ borrowers defaulted long ago. The default rate of younger loans is averaged over origination dates before, at and after 10 months ago, which masks just about everything.

What we really need is a default rate by origination date. I know that is not so easy, and at the same time it gives the analyst room for fiddling around. ‘Days past origination date’ is something different from ‘origination date’: there is data for ‘60 days past origination date’ both for a loan originated in November 2006 and for a loan originated in January 2008.

Thanks for the data! It’s good and inspiring work and very useful for further analysis.

#2 JS on 06.20.08 at 4:43 am

A difficulty with using these curves to estimate yield is that they treat paid-up loans as dropouts and omit them from time periods after they’ve been paid. But a paid-up loan isn’t really a dropout. It’s a guaranteed non-default. A pool of 100 people where 99 pay off their loans within two years and the remaining one defaults is a better investment than a pool where nobody pays off early but 10 people default after year two. Yet under the formula used for these graphs, the first pool will show a 100% default rate (1 out of 1) between years 2 and 3, while the second pool will show a 10% default rate (10 out of 100) in the same period.

In order to obtain a realistic default rate for purposes of figuring out whether a credit group is a good investment or not, a paid-up loan needs to be treated as a non-default and kept in the denominator from the time it is paid to the end of the 3-year period. A paid-up loan results in a smaller gain than originally expected, but no loss.
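
The arithmetic of the two-pool example can be sketched in a few lines of Python. This only illustrates the scenario described in this comment, with both choices of denominator. The function and argument names are hypothetical.

```python
def period_default_rate(defaults, still_open, paid_off, keep_paid_in_denominator):
    """Default rate for one period under either treatment of paid-up loans.

    defaults   -- loans that default during the period
    still_open -- loans still outstanding at the start of the period
    paid_off   -- loans fully repaid before the period started
    """
    denominator = still_open
    if keep_paid_in_denominator:
        # Treat paid-up loans as guaranteed non-defaults.
        denominator += paid_off
    return defaults / denominator


# Pool 1: 99 of 100 loans repaid within two years, the last one defaults.
print(period_default_rate(1, still_open=1, paid_off=99, keep_paid_in_denominator=False))
print(period_default_rate(1, still_open=1, paid_off=99, keep_paid_in_denominator=True))

# Pool 2: nobody repays early, 10 of 100 default between years 2 and 3.
print(period_default_rate(10, still_open=100, paid_off=0, keep_paid_in_denominator=True))
```

With paid-up loans dropped, pool 1 shows 1 out of 1; with them kept, it shows 1 out of 100, which is lower than pool 2's 10 out of 100.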

To address the comparable origination date issue, a version of the Kaplan-Meier approach needs to be used — loans should not appear in the denominator for a period older than their age.
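
For readers unfamiliar with it, the basic Kaplan-Meier (product-limit) estimator can be sketched as below. This is a textbook version under simplifying assumptions, not the method behind the graphs: each loan is reduced to a hypothetical (age in months, defaulted) pair, and a loan that has not defaulted simply leaves the at-risk set once the period exceeds its age.

```python
def kaplan_meier(observations):
    """Product-limit survival estimate.

    observations -- list of (age_in_months, defaulted) pairs; for a loan
                    that has not defaulted, age is how long it has been
                    observed so far.
    Returns a list of (month, survival_probability), one entry per month
    in which at least one default occurred.
    """
    event_months = sorted({age for age, defaulted in observations if defaulted})
    survival = 1.0
    curve = []
    for month in event_months:
        # Only loans at least this old can appear in the denominator.
        at_risk = sum(1 for age, _ in observations if age >= month)
        defaults = sum(1 for age, d in observations if d and age == month)
        survival *= 1.0 - defaults / at_risk
        curve.append((month, survival))
    return curve


# Made-up sample: defaults at months 3 and 8, two loans still performing.
sample = [(3, True), (5, False), (8, True), (12, False)]
print(kaplan_meier(sample))
```

To follow the suggestion about paid-up loans made earlier in this comment, a paid-up loan could be entered with the full 36-month term as its age and defaulted set to False, so that it stays in the denominator to the end.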

#3 RateLadder on 06.20.08 at 7:00 am

I do not change the denominator based on status. The denominator only changes based on the length of time passed since origination…
