🕷️ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:

Response:

Calculated Shard: 169 (from laksa105)

2. Crawled Status Check

Query:

curl -X POST \
  'http://laksa169.int.ahrefs:8124/' \
  -H 'Content-Type: text/plain' \
  -H 'X-ClickHouse-Database: crawler3' \
  -H 'Authorization: Basic YXBpOg==' \
  -d 'SELECT getAhrefsURLFromUnparsed(src_unparsed) AS found_url, ifNull(toUnixTimestamp(download_stamp), 0) AS crawl_time, ifNull(toUnixTimestamp(props_url_first_seen), 0) AS first_indexed_time, download_http_code AS http_code, src_unparsed AS src_unparsed, src_root_hash AS src_root_hash, history_drop_reason AS history_drop_reason, meta_title AS meta_title, meta_descriptions AS meta_descriptions, attrs_boilerpipe_text AS attrs_boilerpipe_text, attrs_markdown AS attrs_markdown, attrs_readable_markdown AS attrs_readable_markdown, meta_canonical AS meta_canonical, ml_categories_json AS ml_categories_json, ml_types_json AS ml_types_json, ml_intent_types_json AS ml_intent_types_json, meta_language AS meta_language, attrs_author AS attrs_author, ifNull(toUnixTimestamp(attrs_publish_time), 0) AS attrs_publish_time, ifNull(toUnixTimestamp(attrs_original_publish_time), 0) AS attrs_original_publish_time, ifNull(attrs_is_republished, 0) AS attrs_is_republished, ifNull(attrs_nr_words, 0) AS attrs_nr_words, ifNull(attrs_boilerpipe_nr_words, 0) AS attrs_boilerpipe_nr_words, ifNull(body_ext_links_number, 0) AS body_ext_links_number, ifNull(body_int_links_number, 0) AS body_int_links_number, ifNull(meta_nofollow, 0) AS meta_nofollow, ifNull(meta_noarchive, 0) AS meta_noarchive, ifNull(props_was_rendered, 0) AS props_was_rendered, ifNull(src_redirect, \'\') AS src_redirect, ifNull(download_time_msec, 0) AS download_time_msec, ifNull(download_ttfb_msec, 0) AS download_ttfb_msec, ifNull(download_size, 0) AS download_size FROM crawler3.page_info_local FINAL PREWHERE (src_root_hash, src_unparsed) IN ((getAhrefsRootHashFromUnparsed(getAhrefsUnparsedNoserviceFromURL(\'https://www.robertmylesmcdonnell.com/posts/bayes-books\')), getAhrefsUnparsedNoserviceFromURL(\'https://www.robertmylesmcdonnell.com/posts/bayes-books\'))) FORMAT JSONEachRow'

Response:

{"found_url":"https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books","crawl_time":1776854254,"first_indexed_time":1714933443,"http_code":200,"src_unparsed":"com,robertmylesmcdonnell!www,\/posts\/bayes-books s443","src_root_hash":"17772841060222480569","history_drop_reason":null,"meta_title":"Rob McDonnell:Blog – Bayesian Stats: Book Recommendations","meta_descriptions":[],"attrs_boilerpipe_text":"The first time I came across Bayes’ Theorem\n1\n, I must admit I was pretty confused. It was in\nIntroductory Statistics\nby Neil A. Weiss, the course book in a statistics course I was taking at the time. Neither the logic of it nor the formula for it made much sense to me. For somebody new to probability, I was still trying to figure out what the hell\nP(A)\nactually\nmeant\n.\nLooking back, the funny thing is that it is the branch of statistics that\nisn’t\nwont to use Bayes’ Theorem that I find confusing\n2\n. Bayesian statistics now makes perfect sense to me. Indeed, it follows human intuition (even if the formula looks weird for anybody new to probability). The probability of the hypothesis we have in mind, given the data we observe (that would be the\n\\(P(A|B)\\)\npart, but we can rewrite it as\nPr(Hypothesis|Data)\n; this is also called the\nposterior\n) is… what exactly?\nWell, it’s a combination of the initial plausibility of our hypothesis (\nP(A)\n, also called the\nprior\n), multiplied by what’s called the likelihood (\nP(B|A)\nor\nPr(Data|Hypothesis)\n), which is the data we would expect to see if our hypothesis were correct. The denominator is often called the ‘evidence’ or something similarly opaque, but it is merely the numerator plus its converse, that is to say, the probability that our original hypothesis is wrong by the likelihood of our data, assuming that our hypothesis is wrong. In any case, its function is just to make sure our probabilities sum to one, as they should.\nAs you can see, all of those\nP(|)\nand\nPr\ncan make things confusing – underneath it all, it’s simpler. First of all, the\nPr()\nnotation and the denominator are often left out, making the theorem look more like this:\n\\[\n\\mathrm{posterior}\\propto\\mathrm{prior}\\times\\mathrm{likelihood}\n\\]\nThe plausibility of our idea, now that we have seen data, is proportional to its original plausibility times the likelihood. Well, reasonably simpler, but the fact is that most people are not comfortable thinking in terms of probability. Given that Bayes’ Theorem is the basic foundation block for an entire body of statistical literature\n3\n, one can see how things could get out of hand pretty quickly – hence the need for good books on the subject. I didn’t learn\nany\nBayesian statistics in any class I ever had, I learned everything I know from reading plenty of books, sometimes the whole way through, sometimes just certain parts until I got bored, from reading on the web, from making many mistakes in R – gradually, I found my way and built my understanding of Bayesian stats. Given that I read (or skimmed) quite a lot of books on the topic, I thought I’d share my two cents on those I came across. There are certainly many more, depending on the specific area of the sciences or on the level of technicality assumed, but these are the ones that I read and either loved, liked somewhat, got bored, or simply got lost (some of them are waaay too difficult… at least for me). Let’s dive in. They’re in no particular order, by the way.\nProbably the first book that comes to mind is one of the first that I read on the topic, Simon Jackman’s\nBayesian Analysis for the Social Sciences\n. Jackman’s book\n4\nhas a nice introduction to the topic and the first few chapters are reasonably easy to follow. However, his mathematics are detailed and even a bit pedantic (nothing wrong with that in an academic book) and things can get heavy going very quickly. Reading through derivations of the conjugacy of probability distributions convinced me I needed to a) go back and re-learn calculus\nagain\n(which I did), and b) go a little further back in the tree of books on Bayesian statistics.\nThis led me down a few interesting paths, from de Finetti’s famous “PROBABILITY DOES NOT EXIST” statement, which I originally saw in Jackman’s book and then hunted down the original (I enjoyed the start quite a lot but then got bored), to learning JAGS to go along with Jackman’s examples, to some rather unnecessary and heavy-going books\nTanner\n, for example). However, I decided to go the ‘source’ (in a modern context) and so I started reading Bernardo and Smith’s\nBayesian Theory\n, which really helped to give me a solid understanding of the concepts involved. It’s a detailed read, and well recommended if you want a deeper understanding of the concepts behind Bayesian statistics.\nMoving from concepts to application, I found Donald Berry’s\nStatistics: a Bayesian Perspective\nreally useful for getting a grip on the basic elements. The book is a bit dated now, and is targeted at an undergraduate audience, which is actually something in its favour. Up until Kruschke and McElreath’s books (mentioned later), most Bayesian stats books seemed to be aimed at people who were already experts in statistics (or knowledgeable, at least), with the aim of convincing them why they should switch to Bayesian methods. As a result, a lot of these books dive headlong into subjects that are not appropriate for most students (see Jackman’s conjugacy discussions, above) and have the effect of turning a lot of students off the material. Berry’s book, although limited, does the opposite.\nThere are other statistics books that cover Bayesian ideas, such as\nBolstad’s\n(I personally didn’t like his style) and deGroot & Schervish’s well-known\nbook\n, which is a fine book, but very dry for my taste. There are also other introductory Bayesian statistics books, such as those by\nLynch\nand\nHoff\n, neither of which really stuck with me. Actually, most Bayesian stats books I read didn’t stick with me. (\nLee\nalso has an introductory text, but I haven’t read it.)\nSticking with introductory level, John Kruschke has a popular\nbook\nwith its quirky dog cover.\nI came a bit late to the Kruschke party, which meant that the two biggest advantages of the book (easy explanation of Bayesian stats and pre-written R functions) were not particularly useful to me, as I was already reasonably proficient in R and understood Bayesian stats quite well. Still, his book is very popular for a reason, which shows just how prevalent the problem of ‘writing-Bayesian-stats-books-for-people-who-are-already-awesome-at-statistics’ is. However, I like to open things up and poke around, and his closed system of pre-written functions didn’t work so well for me (plus, the functions are quite badly written, in my opinion). Closed ecosystems of functions like this are a thing of the past (see\nLaplace’s Demon\n), the future is incorporating well-known methods and function calls with Bayesian machinery running under the hood (such as\nrstanarm\n). Anyway, for someone starting off, it’s a recommended read. Kruschke has some interesting\npapers\ntoo.\nThere are also some ‘classics’ of Bayesian statistics, perhaps the most well-known being the canonical\nBayesian Data Analysis\nby Andrew Gelman & co. I don’t know what keeps me away from this book. It’s very highly regarded (pretty much as\nthe\nbook on Bayesian stats) and well-written, and has a section on computation with\nStan\netc., but I just never seem to sit down and read it. Who knows why\n5\n.\nOf course, there are tons of books on this subject. I could literally fill a long blog post on the books I started and just didn’t like for whatever reason. There are some that are encyclopaedic (\nCongdon’s\nlong list of Bayesian books, for example), others, designed for business students, that were just kind of ‘meh’:\nMarin & Robert\nand\nRossi\n, for example. Others are useful for learning Bayesian methods\nand\nR, such as Jim Albert’s\nbook\n,\nBayesian Computation with R\n. This is a really useful book actually, but like I said earlier with reference to Kruschke, by the time I came to it, I was already using JAGS and on a different path and so I had no great use for Albert’s R functions. Still, it’s a well-regarded book by an acknowledged R expert. There are also good books on Bayesian econometrics (\nKoop\n) and time-series (\nPole, West & Harrison\n, haaard) and Jeff Gill has\none\nfor the social & behavioural sciences (I\nreally\ndidn’t get into this); there are also many others throughout the specific fields of the sciences.\nThe emphasis on R is something that has carried through to the newer batch of Bayesian statistics books, which place more emphasis on the ‘data analysis’ part (that is, being empirical instead of theoretical, and getting your hands dirty with computer programming in R from the off) than on theoretical underpinnings. You will find some targeted at a Python audience, for example, Davidson-Pilon’s\nbook\n, which is available as an editable github-type-webbook on the net, as well as a printed book. (Its title will tell you a bit about the Python audience it aims for – more computer programmers than the academic\/statistician audience that use R. Indeed, most of the Python Bayesian analysis resources are found on the web as opposed to in books. Although some are just books on the web in\npdf format\n.) For me personally, this is good news. I liked the theoretical knowledge that I gained from Bernardo & Smith, but once I had to delve into understanding matrix algebra (just to see a linear regression derivation) or complex characteristics of probability distributions, I was already thinking of how I’d rather have a beer. Programming, at least for me, is a perfect way to connect the theory and the concepts to the reality of actually\ndoing\nsome Bayesian analysis.\nThis brings me nicely to two books that I think really utilise this approach to good effect, one recent, one a decade or so old: Richard McElreath’s\nStatistical Rethinking\n(new) and Gelman & Hill’s\nData Analysis Using Regression and Multilevel\/Hierarchical Models\n(older). While I’ve only recently started reading McElreath’s book, it seems like\nexactly\nthe type of book that I would have liked to have when I started out. No complicated mathematics, just sensible advice and a heavy emphasis on doing analysis in R. The book is well-written (and very well backed up with references, there’s a ton of information to follow-up from the endnotes if you’re so inclined) and contains lucid arguments for why the author believes we need to approach statistics from a fresh angle. Although I haven’t finished it, I do recommend it already.\nIt’s similar in some ways to Gelman & Hill’s book, and one can see the influence of Andrew Gelman (particularly through his emphasis on the ‘doing’ of Bayesian statistics) in Statistical Rethinking.\nData Analysis Using…\nis likewise focused on analysis, learned through computer programming. It features both frequentist and Bayesian takes on statistical methods, and contains detailed computer code for (the now somewhat dated) BUGS language (see\nhere\nfor why BUGS and its cousin JAGS are not always optimal for Bayesian analysis). It also contains some sage advice for researchers: try out simple models using quick methods like\nlm()\nas you build up your model (advice that I certainly needed on at least one occasion).\nSo that’s my take on Bayesian statistics\/data analysis books. As befits the age we live in, you’ll likely learn just as much from sites like\nStack Overflow\nor from\nblog\nposts\nand\nRPubs\nand\nGithub\nthan you will from books. Academic papers often helped me more than books too. Still, a good book can teach you a hell of a lot in a consistent way. There are many referenced in this post, some better than others, but all have their qualities. For me specifically, someone who is not mad about reading lots of mathematics, the last two are my recommendations. For others, this will obviously be different (I know someone who loves Jackman’s book, for example).\nBy the way, there are some informal books on the subject, such as Nate Silver’s\nThe Signal and the Noise\n, or McGrayne’s\nThe Theory That Would Not Die\n. I’ve given Amazon or publisher links for all these books, bar a few, but they can be found in other places too…you know what I’m talking about. Buy the ones you like, though!\n👮\nFootnotes\nMaybe this formula wouldn’t have made much sense to the ol’ Reverend Thomas Bayes either, since he used the Newtonian style of geometric exposition. For the history of the theorem, see\nMcGrayne\n. For a history of statistics in general, including Bayes (and Laplace, who probably did much more to develop Bayesian statistics than Bayes ever did) see the fantastic\nStigler\n.\n↩︎\nThat would be ‘frequentist’ or ‘classical’ (or ‘traditional’) statistics (take your pick of adjective). Most of the Bayesian books above will have sections comparing the traditions. McElreath doesn’t bother, which is a nice development in its own way.\n↩︎\nAlthough there is only ever one core method: posterior \n.\n↩︎\nI’ve learned much more from Jackman’s various political science articles, in which he uses Bayesian methods, than this book, to be honest.\n↩︎\nSpeaking of Gelman, he has a literal treasure trove of papers and web discussions on the subject of Bayesian data analysis. See his\nsite\nfor many links to those.\n↩︎","attrs_markdown":"[Rob McDonnell:Blog](https:\/\/www.robertmylesmcdonnell.com\/)\n\n- [About](https:\/\/www.robertmylesmcdonnell.com\/about)\n\n# Bayesian Stats: Book Recommendations\nBayesian stats\n\nPublished\n\nMay 3, 2016\n\nThe first time I came across Bayes’ Theorem[1](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fn1), I must admit I was pretty confused. It was in [Introductory Statistics](https:\/\/www.amazon.com\/Introductory-Statistics-9th-Neil-Weiss\/dp\/0321691229\/ref=pd_cp_14_1?ie=UTF8&refRID=15HFJDGMBF49CXAD9W9S) by Neil A. Weiss, the course book in a statistics course I was taking at the time. Neither the logic of it nor the formula for it made much sense to me. For somebody new to probability, I was still trying to figure out what the hell *P(A)* actually *meant*.\n\nLooking back, the funny thing is that it is the branch of statistics that *isn’t* wont to use Bayes’ Theorem that I find confusing[2](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fn2). Bayesian statistics now makes perfect sense to me. Indeed, it follows human intuition (even if the formula looks weird for anybody new to probability). The probability of the hypothesis we have in mind, given the data we observe (that would be the \\\\(P(A\\|B)\\\\) part, but we can rewrite it as *Pr(Hypothesis\\|Data)*; this is also called the *posterior* ) is… what exactly?\n\nWell, it’s a combination of the initial plausibility of our hypothesis (*P(A)*, also called the *prior* ), multiplied by what’s called the likelihood (*P(B\\|A)* or *Pr(Data\\|Hypothesis)*), which is the data we would expect to see if our hypothesis were correct. The denominator is often called the ‘evidence’ or something similarly opaque, but it is merely the numerator plus its converse, that is to say, the probability that our original hypothesis is wrong by the likelihood of our data, assuming that our hypothesis is wrong. In any case, its function is just to make sure our probabilities sum to one, as they should.\n\nAs you can see, all of those *P(\\|)* and *Pr* can make things confusing – underneath it all, it’s simpler. First of all, the *Pr()* notation and the denominator are often left out, making the theorem look more like this:\n\n\\\\\\[ \\\\mathrm{posterior}\\\\propto\\\\mathrm{prior}\\\\times\\\\mathrm{likelihood} \\\\\\]\n\nThe plausibility of our idea, now that we have seen data, is proportional to its original plausibility times the likelihood. Well, reasonably simpler, but the fact is that most people are not comfortable thinking in terms of probability. Given that Bayes’ Theorem is the basic foundation block for an entire body of statistical literature[3](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fn3), one can see how things could get out of hand pretty quickly – hence the need for good books on the subject. I didn’t learn *any* Bayesian statistics in any class I ever had, I learned everything I know from reading plenty of books, sometimes the whole way through, sometimes just certain parts until I got bored, from reading on the web, from making many mistakes in R – gradually, I found my way and built my understanding of Bayesian stats. Given that I read (or skimmed) quite a lot of books on the topic, I thought I’d share my two cents on those I came across. There are certainly many more, depending on the specific area of the sciences or on the level of technicality assumed, but these are the ones that I read and either loved, liked somewhat, got bored, or simply got lost (some of them are waaay too difficult… at least for me). Let’s dive in. They’re in no particular order, by the way.\n\n![](https:\/\/i.imgur.com\/b0EkBG5.jpg?1)\n\nProbably the first book that comes to mind is one of the first that I read on the topic, Simon Jackman’s [Bayesian Analysis for the Social Sciences](https:\/\/onlinelibrary.wiley.com\/book\/10.1002\/9780470686621). Jackman’s book[4](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fn4) has a nice introduction to the topic and the first few chapters are reasonably easy to follow. However, his mathematics are detailed and even a bit pedantic (nothing wrong with that in an academic book) and things can get heavy going very quickly. Reading through derivations of the conjugacy of probability distributions convinced me I needed to a) go back and re-learn calculus *again* (which I did), and b) go a little further back in the tree of books on Bayesian statistics.\n\n![](https:\/\/i.imgur.com\/jpNFefC.png?1)\n\nThis led me down a few interesting paths, from de Finetti’s famous “PROBABILITY DOES NOT EXIST” statement, which I originally saw in Jackman’s book and then hunted down the original (I enjoyed the start quite a lot but then got bored), to learning JAGS to go along with Jackman’s examples, to some rather unnecessary and heavy-going books [Tanner](https:\/\/link.springer.com\/book\/10.1007%2F978-1-4612-4024-2), for example). However, I decided to go the ‘source’ (in a modern context) and so I started reading Bernardo and Smith’s [Bayesian Theory](https:\/\/onlinelibrary.wiley.com\/book\/10.1002\/9780470316870), which really helped to give me a solid understanding of the concepts involved. It’s a detailed read, and well recommended if you want a deeper understanding of the concepts behind Bayesian statistics.\n\nMoving from concepts to application, I found Donald Berry’s [Statistics: a Bayesian Perspective](https:\/\/www.amazon.com\/Statistics-Bayesian-Perspective-Donald-Berry\/dp\/0534234720) really useful for getting a grip on the basic elements. The book is a bit dated now, and is targeted at an undergraduate audience, which is actually something in its favour. Up until Kruschke and McElreath’s books (mentioned later), most Bayesian stats books seemed to be aimed at people who were already experts in statistics (or knowledgeable, at least), with the aim of convincing them why they should switch to Bayesian methods. As a result, a lot of these books dive headlong into subjects that are not appropriate for most students (see Jackman’s conjugacy discussions, above) and have the effect of turning a lot of students off the material. Berry’s book, although limited, does the opposite.\n\nThere are other statistics books that cover Bayesian ideas, such as [Bolstad’s](https:\/\/www.amazon.com\/Introduction-Bayesian-Statistics-William-Bolstad\/dp\/0470141158) (I personally didn’t like his style) and deGroot & Schervish’s well-known [book](https:\/\/www.pearsonhighered.com\/program\/De-Groot-Probability-and-Statistics-4th-Edition\/PGM146802.html), which is a fine book, but very dry for my taste. There are also other introductory Bayesian statistics books, such as those by [Lynch](https:\/\/www.springer.com\/us\/book\/9780387712642) and [Hoff](https:\/\/www.springer.com\/us\/book\/9780387922997), neither of which really stuck with me. Actually, most Bayesian stats books I read didn’t stick with me. ([Lee](https:\/\/www.amazon.com\/Bayesian-Statistics-Introduction-Peter-Lee\/dp\/1118332571) also has an introductory text, but I haven’t read it.)\n\nSticking with introductory level, John Kruschke has a popular [book](https:\/\/sites.google.com\/site\/doingbayesiandataanalysis\/) with its quirky dog cover.\n\n![](https:\/\/www.robertmylesmcdonnell.com\/posts\/images\/kruschkeDogs.jpg)\n\nI came a bit late to the Kruschke party, which meant that the two biggest advantages of the book (easy explanation of Bayesian stats and pre-written R functions) were not particularly useful to me, as I was already reasonably proficient in R and understood Bayesian stats quite well. Still, his book is very popular for a reason, which shows just how prevalent the problem of ‘writing-Bayesian-stats-books-for-people-who-are-already-awesome-at-statistics’ is. However, I like to open things up and poke around, and his closed system of pre-written functions didn’t work so well for me (plus, the functions are quite badly written, in my opinion). Closed ecosystems of functions like this are a thing of the past (see [Laplace’s Demon](https:\/\/cran.r-project.org\/web\/packages\/LaplacesDemon\/index.html)), the future is incorporating well-known methods and function calls with Bayesian machinery running under the hood (such as [rstanarm](https:\/\/cran.r-project.org\/web\/packages\/rstanarm\/vignettes\/rstanarm.html)). Anyway, for someone starting off, it’s a recommended read. Kruschke has some interesting [papers](https:\/\/www.indiana.edu\/~kruschke\/BEST\/BEST.pdf) too.\n\nThere are also some ‘classics’ of Bayesian statistics, perhaps the most well-known being the canonical [Bayesian Data Analysis](https:\/\/www.stat.columbia.edu\/~gelman\/book\/) by Andrew Gelman & co. I don’t know what keeps me away from this book. It’s very highly regarded (pretty much as *the* book on Bayesian stats) and well-written, and has a section on computation with [Stan](https:\/\/mc-stan.org\/) etc., but I just never seem to sit down and read it. Who knows why[5](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fn5).\n\n![](https:\/\/i.imgur.com\/hfaZkl3.png?1)\n\nOf course, there are tons of books on this subject. I could literally fill a long blog post on the books I started and just didn’t like for whatever reason. There are some that are encyclopaedic ([Congdon’s](https:\/\/webspace.qmul.ac.uk\/pcongdon\/) long list of Bayesian books, for example), others, designed for business students, that were just kind of ‘meh’: [Marin & Robert](https:\/\/www.springer.com\/us\/book\/9780387389837) and [Rossi](https:\/\/www.wiley.com\/WileyCDA\/WileyTitle\/productCd-0470863676.html), for example. Others are useful for learning Bayesian methods *and* R, such as Jim Albert’s [book](https:\/\/books.google.com.br\/books\/about\/Bayesian_Computation_with_R.html?id=kVk_WHFfIrMC&redir_esc=y), *Bayesian Computation with R*. This is a really useful book actually, but like I said earlier with reference to Kruschke, by the time I came to it, I was already using JAGS and on a different path and so I had no great use for Albert’s R functions. Still, it’s a well-regarded book by an acknowledged R expert. There are also good books on Bayesian econometrics ([Koop](https:\/\/www.wiley.com\/legacy\/wileychi\/koopbayesian\/)) and time-series ([Pole, West & Harrison](https:\/\/books.google.com.br\/books\/about\/Applied_Bayesian_Forecasting_and_Time_Se.html?id=LAcp-ZwnyxIC&redir_esc=y), haaard) and Jeff Gill has [one](https:\/\/www.crcpress.com\/Bayesian-Methods-A-Social-and-Behavioral-Sciences-Approach-Third-Edition\/Gill\/9781439862483) for the social & behavioural sciences (I *really* didn’t get into this); there are also many others throughout the specific fields of the sciences.\n\n![](https:\/\/i.imgur.com\/M7y9pgs.png?2)\n\nThe emphasis on R is something that has carried through to the newer batch of Bayesian statistics books, which place more emphasis on the ‘data analysis’ part (that is, being empirical instead of theoretical, and getting your hands dirty with computer programming in R from the off) than on theoretical underpinnings. You will find some targeted at a Python audience, for example, Davidson-Pilon’s [book](https:\/\/camdavidsonpilon.github.io\/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers\/), which is available as an editable github-type-webbook on the net, as well as a printed book. (Its title will tell you a bit about the Python audience it aims for – more computer programmers than the academic\/statistician audience that use R. Indeed, most of the Python Bayesian analysis resources are found on the web as opposed to in books. Although some are just books on the web in [pdf format](https:\/\/www.greenteapress.com\/thinkbayes\/thinkbayes.pdf).) For me personally, this is good news. I liked the theoretical knowledge that I gained from Bernardo & Smith, but once I had to delve into understanding matrix algebra (just to see a linear regression derivation) or complex characteristics of probability distributions, I was already thinking of how I’d rather have a beer. Programming, at least for me, is a perfect way to connect the theory and the concepts to the reality of actually *doing* some Bayesian analysis.\n\nThis brings me nicely to two books that I think really utilise this approach to good effect, one recent, one a decade or so old: Richard McElreath’s [Statistical Rethinking](https:\/\/xcelab.net\/rm\/statistical-rethinking\/) (new) and Gelman & Hill’s [Data Analysis Using Regression and Multilevel\/Hierarchical Models](https:\/\/www.stat.columbia.edu\/~gelman\/arm\/) (older). While I’ve only recently started reading McElreath’s book, it seems like **exactly** the type of book that I would have liked to have when I started out. No complicated mathematics, just sensible advice and a heavy emphasis on doing analysis in R. The book is well-written (and very well backed up with references, there’s a ton of information to follow-up from the endnotes if you’re so inclined) and contains lucid arguments for why the author believes we need to approach statistics from a fresh angle. Although I haven’t finished it, I do recommend it already.\n\nIt’s similar in some ways to Gelman & Hill’s book, and one can see the influence of Andrew Gelman (particularly through his emphasis on the ‘doing’ of Bayesian statistics) in Statistical Rethinking. *Data Analysis Using…* is likewise focused on analysis, learned through computer programming. It features both frequentist and Bayesian takes on statistical methods, and contains detailed computer code for (the now somewhat dated) BUGS language (see [here](https:\/\/robertmylesmcdonnell.com\/stan-or-jags-for-bayesian-ideal-point-irt) for why BUGS and its cousin JAGS are not always optimal for Bayesian analysis). It also contains some sage advice for researchers: try out simple models using quick methods like `lm()` as you build up your model (advice that I certainly needed on at least one occasion).\n\n![](https:\/\/i.imgur.com\/oH57M4t.png?1)\n\nSo that’s my take on Bayesian statistics\/data analysis books. As befits the age we live in, you’ll likely learn just as much from sites like [Stack Overflow](https:\/\/stackoverflow.com\/search?q=Bayesian+) or from [blog](https:\/\/darrenjw.wordpress.com\/2012\/11\/20\/getting-started-with-bayesian-variable-selection-using-jags-and-rjags\/) [posts](https:\/\/twiecki.github.io\/blog\/2015\/11\/10\/mcmc-sampling\/) and [RPubs](https:\/\/rpubs.com\/corey_sparks\/30893) and [Github](https:\/\/github.com\/robertmyles\/bayesian-ideal-point-irt-models) than you will from books. Academic papers often helped me more than books too. Still, a good book can teach you a hell of a lot in a consistent way. There are many referenced in this post, some better than others, but all have their qualities. For me specifically, someone who is not mad about reading lots of mathematics, the last two are my recommendations. For others, this will obviously be different (I know someone who loves Jackman’s book, for example).\n\n*By the way, there are some informal books on the subject, such as Nate Silver’s [The Signal and the Noise](https:\/\/www.penguinrandomhouse.com\/books\/305826\/the-signal-and-the-noise-by-nate-silver\/9780143125082\/), or McGrayne’s [The Theory That Would Not Die](https:\/\/www.amazon.ca\/Theory-That-Would-Not-Die\/dp\/0300169698\/181-0429523-7916830?ie=UTF8&tag=vglnk-ca-c250-20). I’ve given Amazon or publisher links for all these books, bar a few, but they can be found in other places too…you know what I’m talking about. Buy the ones you like, though\\!* 👮\n\n## Footnotes\n1. Maybe this formula wouldn’t have made much sense to the ol’ Reverend Thomas Bayes either, since he used the Newtonian style of geometric exposition. For the history of the theorem, see [McGrayne](https:\/\/www.amazon.ca\/Theory-That-Would-Not-Die\/dp\/0300169698\/181-0429523-7916830?ie=UTF8&tag=vglnk-ca-c250-20). For a history of statistics in general, including Bayes (and Laplace, who probably did much more to develop Bayesian statistics than Bayes ever did) see the fantastic [Stigler](https:\/\/www.hup.harvard.edu\/catalog.php?isbn=9780674403413).[↩︎](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fnref1)\n2. That would be ‘frequentist’ or ‘classical’ (or ‘traditional’) statistics (take your pick of adjective). Most of the Bayesian books above will have sections comparing the traditions. McElreath doesn’t bother, which is a nice development in its own way.[↩︎](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fnref2)\n3. Although there is only ever one core method: posterior .[↩︎](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fnref3)\n4. I’ve learned much more from Jackman’s various political science articles, in which he uses Bayesian methods, than this book, to be honest.[↩︎](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fnref4)\n5. Speaking of Gelman, he has a literal treasure trove of papers and web discussions on the subject of Bayesian data analysis. See his [site](https:\/\/andrewgelman.com\/) for many links to those.[↩︎](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fnref5)","attrs_readable_markdown":"The first time I came across Bayes’ Theorem[1](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fn1), I must admit I was pretty confused. It was in [Introductory Statistics](https:\/\/www.amazon.com\/Introductory-Statistics-9th-Neil-Weiss\/dp\/0321691229\/ref=pd_cp_14_1?ie=UTF8&refRID=15HFJDGMBF49CXAD9W9S) by Neil A. Weiss, the course book in a statistics course I was taking at the time. Neither the logic of it nor the formula for it made much sense to me. For somebody new to probability, I was still trying to figure out what the hell *P(A)* actually *meant*.\n\nLooking back, the funny thing is that it is the branch of statistics that *isn’t* wont to use Bayes’ Theorem that I find confusing[2](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fn2). Bayesian statistics now makes perfect sense to me. Indeed, it follows human intuition (even if the formula looks weird for anybody new to probability). The probability of the hypothesis we have in mind, given the data we observe (that would be the \\\\(P(A\\|B)\\\\) part, but we can rewrite it as *Pr(Hypothesis\\|Data)*; this is also called the *posterior* ) is… what exactly?\n\nWell, it’s a combination of the initial plausibility of our hypothesis (*P(A)*, also called the *prior* ), multiplied by what’s called the likelihood (*P(B\\|A)* or *Pr(Data\\|Hypothesis)*), which is the data we would expect to see if our hypothesis were correct. The denominator is often called the ‘evidence’ or something similarly opaque, but it is merely the numerator plus its converse, that is to say, the probability that our original hypothesis is wrong by the likelihood of our data, assuming that our hypothesis is wrong. In any case, its function is just to make sure our probabilities sum to one, as they should.\n\nAs you can see, all of those *P(\\|)* and *Pr* can make things confusing – underneath it all, it’s simpler. First of all, the *Pr()* notation and the denominator are often left out, making the theorem look more like this:\n\n\\\\\\[ \\\\mathrm{posterior}\\\\propto\\\\mathrm{prior}\\\\times\\\\mathrm{likelihood} \\\\\\]\n\nThe plausibility of our idea, now that we have seen data, is proportional to its original plausibility times the likelihood. Well, reasonably simpler, but the fact is that most people are not comfortable thinking in terms of probability. Given that Bayes’ Theorem is the basic foundation block for an entire body of statistical literature[3](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fn3), one can see how things could get out of hand pretty quickly – hence the need for good books on the subject. I didn’t learn *any* Bayesian statistics in any class I ever had, I learned everything I know from reading plenty of books, sometimes the whole way through, sometimes just certain parts until I got bored, from reading on the web, from making many mistakes in R – gradually, I found my way and built my understanding of Bayesian stats. Given that I read (or skimmed) quite a lot of books on the topic, I thought I’d share my two cents on those I came across. There are certainly many more, depending on the specific area of the sciences or on the level of technicality assumed, but these are the ones that I read and either loved, liked somewhat, got bored, or simply got lost (some of them are waaay too difficult… at least for me). Let’s dive in. They’re in no particular order, by the way.\n\n![](https:\/\/i.imgur.com\/b0EkBG5.jpg?1)\n\nProbably the first book that comes to mind is one of the first that I read on the topic, Simon Jackman’s [Bayesian Analysis for the Social Sciences](https:\/\/onlinelibrary.wiley.com\/book\/10.1002\/9780470686621). Jackman’s book[4](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fn4) has a nice introduction to the topic and the first few chapters are reasonably easy to follow. However, his mathematics are detailed and even a bit pedantic (nothing wrong with that in an academic book) and things can get heavy going very quickly. Reading through derivations of the conjugacy of probability distributions convinced me I needed to a) go back and re-learn calculus *again* (which I did), and b) go a little further back in the tree of books on Bayesian statistics.\n\n![](https:\/\/i.imgur.com\/jpNFefC.png?1)\n\nThis led me down a few interesting paths, from de Finetti’s famous “PROBABILITY DOES NOT EXIST” statement, which I originally saw in Jackman’s book and then hunted down the original (I enjoyed the start quite a lot but then got bored), to learning JAGS to go along with Jackman’s examples, to some rather unnecessary and heavy-going books [Tanner](https:\/\/link.springer.com\/book\/10.1007%2F978-1-4612-4024-2), for example). However, I decided to go the ‘source’ (in a modern context) and so I started reading Bernardo and Smith’s [Bayesian Theory](https:\/\/onlinelibrary.wiley.com\/book\/10.1002\/9780470316870), which really helped to give me a solid understanding of the concepts involved. It’s a detailed read, and well recommended if you want a deeper understanding of the concepts behind Bayesian statistics.\n\nMoving from concepts to application, I found Donald Berry’s [Statistics: a Bayesian Perspective](https:\/\/www.amazon.com\/Statistics-Bayesian-Perspective-Donald-Berry\/dp\/0534234720) really useful for getting a grip on the basic elements. The book is a bit dated now, and is targeted at an undergraduate audience, which is actually something in its favour. Up until Kruschke and McElreath’s books (mentioned later), most Bayesian stats books seemed to be aimed at people who were already experts in statistics (or knowledgeable, at least), with the aim of convincing them why they should switch to Bayesian methods. As a result, a lot of these books dive headlong into subjects that are not appropriate for most students (see Jackman’s conjugacy discussions, above) and have the effect of turning a lot of students off the material. Berry’s book, although limited, does the opposite.\n\nThere are other statistics books that cover Bayesian ideas, such as [Bolstad’s](https:\/\/www.amazon.com\/Introduction-Bayesian-Statistics-William-Bolstad\/dp\/0470141158) (I personally didn’t like his style) and deGroot & Schervish’s well-known [book](https:\/\/www.pearsonhighered.com\/program\/De-Groot-Probability-and-Statistics-4th-Edition\/PGM146802.html), which is a fine book, but very dry for my taste. There are also other introductory Bayesian statistics books, such as those by [Lynch](https:\/\/www.springer.com\/us\/book\/9780387712642) and [Hoff](https:\/\/www.springer.com\/us\/book\/9780387922997), neither of which really stuck with me. Actually, most Bayesian stats books I read didn’t stick with me. ([Lee](https:\/\/www.amazon.com\/Bayesian-Statistics-Introduction-Peter-Lee\/dp\/1118332571) also has an introductory text, but I haven’t read it.)\n\nSticking with introductory level, John Kruschke has a popular [book](https:\/\/sites.google.com\/site\/doingbayesiandataanalysis\/) with its quirky dog cover.\n\n![](https:\/\/www.robertmylesmcdonnell.com\/posts\/images\/kruschkeDogs.jpg)\n\nI came a bit late to the Kruschke party, which meant that the two biggest advantages of the book (easy explanation of Bayesian stats and pre-written R functions) were not particularly useful to me, as I was already reasonably proficient in R and understood Bayesian stats quite well. Still, his book is very popular for a reason, which shows just how prevalent the problem of ‘writing-Bayesian-stats-books-for-people-who-are-already-awesome-at-statistics’ is. However, I like to open things up and poke around, and his closed system of pre-written functions didn’t work so well for me (plus, the functions are quite badly written, in my opinion). Closed ecosystems of functions like this are a thing of the past (see [Laplace’s Demon](https:\/\/cran.r-project.org\/web\/packages\/LaplacesDemon\/index.html)), the future is incorporating well-known methods and function calls with Bayesian machinery running under the hood (such as [rstanarm](https:\/\/cran.r-project.org\/web\/packages\/rstanarm\/vignettes\/rstanarm.html)). Anyway, for someone starting off, it’s a recommended read. Kruschke has some interesting [papers](https:\/\/www.indiana.edu\/~kruschke\/BEST\/BEST.pdf) too.\n\nThere are also some ‘classics’ of Bayesian statistics, perhaps the most well-known being the canonical [Bayesian Data Analysis](https:\/\/www.stat.columbia.edu\/~gelman\/book\/) by Andrew Gelman & co. I don’t know what keeps me away from this book. It’s very highly regarded (pretty much as *the* book on Bayesian stats) and well-written, and has a section on computation with [Stan](https:\/\/mc-stan.org\/) etc., but I just never seem to sit down and read it. Who knows why[5](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fn5).\n\n![](https:\/\/i.imgur.com\/hfaZkl3.png?1)\n\nOf course, there are tons of books on this subject. I could literally fill a long blog post on the books I started and just didn’t like for whatever reason. There are some that are encyclopaedic ([Congdon’s](https:\/\/webspace.qmul.ac.uk\/pcongdon\/) long list of Bayesian books, for example), others, designed for business students, that were just kind of ‘meh’: [Marin & Robert](https:\/\/www.springer.com\/us\/book\/9780387389837) and [Rossi](https:\/\/www.wiley.com\/WileyCDA\/WileyTitle\/productCd-0470863676.html), for example. Others are useful for learning Bayesian methods *and* R, such as Jim Albert’s [book](https:\/\/books.google.com.br\/books\/about\/Bayesian_Computation_with_R.html?id=kVk_WHFfIrMC&redir_esc=y), *Bayesian Computation with R*. This is a really useful book actually, but like I said earlier with reference to Kruschke, by the time I came to it, I was already using JAGS and on a different path and so I had no great use for Albert’s R functions. Still, it’s a well-regarded book by an acknowledged R expert. There are also good books on Bayesian econometrics ([Koop](https:\/\/www.wiley.com\/legacy\/wileychi\/koopbayesian\/)) and time-series ([Pole, West & Harrison](https:\/\/books.google.com.br\/books\/about\/Applied_Bayesian_Forecasting_and_Time_Se.html?id=LAcp-ZwnyxIC&redir_esc=y), haaard) and Jeff Gill has [one](https:\/\/www.crcpress.com\/Bayesian-Methods-A-Social-and-Behavioral-Sciences-Approach-Third-Edition\/Gill\/9781439862483) for the social & behavioural sciences (I *really* didn’t get into this); there are also many others throughout the specific fields of the sciences.\n\n![](https:\/\/i.imgur.com\/M7y9pgs.png?2)\n\nThe emphasis on R is something that has carried through to the newer batch of Bayesian statistics books, which place more emphasis on the ‘data analysis’ part (that is, being empirical instead of theoretical, and getting your hands dirty with computer programming in R from the off) than on theoretical underpinnings. You will find some targeted at a Python audience, for example, Davidson-Pilon’s [book](https:\/\/camdavidsonpilon.github.io\/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers\/), which is available as an editable github-type-webbook on the net, as well as a printed book. (Its title will tell you a bit about the Python audience it aims for – more computer programmers than the academic\/statistician audience that use R. Indeed, most of the Python Bayesian analysis resources are found on the web as opposed to in books. Although some are just books on the web in [pdf format](https:\/\/www.greenteapress.com\/thinkbayes\/thinkbayes.pdf).) For me personally, this is good news. I liked the theoretical knowledge that I gained from Bernardo & Smith, but once I had to delve into understanding matrix algebra (just to see a linear regression derivation) or complex characteristics of probability distributions, I was already thinking of how I’d rather have a beer. Programming, at least for me, is a perfect way to connect the theory and the concepts to the reality of actually *doing* some Bayesian analysis.\n\nThis brings me nicely to two books that I think really utilise this approach to good effect, one recent, one a decade or so old: Richard McElreath’s [Statistical Rethinking](https:\/\/xcelab.net\/rm\/statistical-rethinking\/) (new) and Gelman & Hill’s [Data Analysis Using Regression and Multilevel\/Hierarchical Models](https:\/\/www.stat.columbia.edu\/~gelman\/arm\/) (older). While I’ve only recently started reading McElreath’s book, it seems like **exactly** the type of book that I would have liked to have when I started out. No complicated mathematics, just sensible advice and a heavy emphasis on doing analysis in R. The book is well-written (and very well backed up with references, there’s a ton of information to follow-up from the endnotes if you’re so inclined) and contains lucid arguments for why the author believes we need to approach statistics from a fresh angle. Although I haven’t finished it, I do recommend it already.\n\nIt’s similar in some ways to Gelman & Hill’s book, and one can see the influence of Andrew Gelman (particularly through his emphasis on the ‘doing’ of Bayesian statistics) in Statistical Rethinking. *Data Analysis Using…* is likewise focused on analysis, learned through computer programming. It features both frequentist and Bayesian takes on statistical methods, and contains detailed computer code for (the now somewhat dated) BUGS language (see [here](https:\/\/robertmylesmcdonnell.com\/stan-or-jags-for-bayesian-ideal-point-irt) for why BUGS and its cousin JAGS are not always optimal for Bayesian analysis). It also contains some sage advice for researchers: try out simple models using quick methods like `lm()` as you build up your model (advice that I certainly needed on at least one occasion).\n\n![](https:\/\/i.imgur.com\/oH57M4t.png?1)\n\nSo that’s my take on Bayesian statistics\/data analysis books. As befits the age we live in, you’ll likely learn just as much from sites like [Stack Overflow](https:\/\/stackoverflow.com\/search?q=Bayesian+) or from [blog](https:\/\/darrenjw.wordpress.com\/2012\/11\/20\/getting-started-with-bayesian-variable-selection-using-jags-and-rjags\/) [posts](https:\/\/twiecki.github.io\/blog\/2015\/11\/10\/mcmc-sampling\/) and [RPubs](https:\/\/rpubs.com\/corey_sparks\/30893) and [Github](https:\/\/github.com\/robertmyles\/bayesian-ideal-point-irt-models) than you will from books. Academic papers often helped me more than books too. Still, a good book can teach you a hell of a lot in a consistent way. There are many referenced in this post, some better than others, but all have their qualities. For me specifically, someone who is not mad about reading lots of mathematics, the last two are my recommendations. For others, this will obviously be different (I know someone who loves Jackman’s book, for example).\n\n*By the way, there are some informal books on the subject, such as Nate Silver’s [The Signal and the Noise](https:\/\/www.penguinrandomhouse.com\/books\/305826\/the-signal-and-the-noise-by-nate-silver\/9780143125082\/), or McGrayne’s [The Theory That Would Not Die](https:\/\/www.amazon.ca\/Theory-That-Would-Not-Die\/dp\/0300169698\/181-0429523-7916830?ie=UTF8&tag=vglnk-ca-c250-20). I’ve given Amazon or publisher links for all these books, bar a few, but they can be found in other places too…you know what I’m talking about. Buy the ones you like, though\\!* 👮\n\n## Footnotes\n1. Maybe this formula wouldn’t have made much sense to the ol’ Reverend Thomas Bayes either, since he used the Newtonian style of geometric exposition. For the history of the theorem, see [McGrayne](https:\/\/www.amazon.ca\/Theory-That-Would-Not-Die\/dp\/0300169698\/181-0429523-7916830?ie=UTF8&tag=vglnk-ca-c250-20). For a history of statistics in general, including Bayes (and Laplace, who probably did much more to develop Bayesian statistics than Bayes ever did) see the fantastic [Stigler](https:\/\/www.hup.harvard.edu\/catalog.php?isbn=9780674403413).[↩︎](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fnref1)\n2. That would be ‘frequentist’ or ‘classical’ (or ‘traditional’) statistics (take your pick of adjective). Most of the Bayesian books above will have sections comparing the traditions. McElreath doesn’t bother, which is a nice development in its own way.[↩︎](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fnref2)\n3. Although there is only ever one core method: posterior .[↩︎](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fnref3)\n4. I’ve learned much more from Jackman’s various political science articles, in which he uses Bayesian methods, than this book, to be honest.[↩︎](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fnref4)\n5. Speaking of Gelman, he has a literal treasure trove of papers and web discussions on the subject of Bayesian data analysis. See his [site](https:\/\/andrewgelman.com\/) for many links to those.[↩︎](https:\/\/www.robertmylesmcdonnell.com\/posts\/bayes-books#fnref5)","meta_canonical":null,"ml_categories_json":"","ml_types_json":"","ml_intent_types_json":"","meta_language":"en","attrs_author":null,"attrs_publish_time":0,"attrs_original_publish_time":1714933443,"attrs_is_republished":0,"attrs_nr_words":"2231","attrs_boilerpipe_nr_words":"2211","body_ext_links_number":48,"body_int_links_number":4,"meta_nofollow":0,"meta_noarchive":0,"props_was_rendered":0,"src_redirect":"","download_time_msec":594,"download_ttfb_msec":492,"download_size":12948}

3. Robots.txt Check

Query:

Response:

4. Spam/Ban Check

Query:

Response:

5. Seen Status Check

ℹ️ Skipped - page is already crawled

📄

INDEXABLE

✅

CRAWLED

5 days ago

🤖

ROBOTS ALLOWED

Page Info Filters

Filter	Status	Condition	Details
HTTP status	PASS	`download_http_code = 200`	HTTP 200
Age cutoff	PASS	`download_stamp > now() - 6 MONTH`	0.2 months ago
History drop	PASS	`isNull(history_drop_reason)`	No drop reason
Spam/ban	PASS	`fh_dont_index != 1 AND ml_spam_score = 0`	ml_spam_score=0
Canonical	PASS	`meta_canonical IS NULL OR = '' OR = src_unparsed`	Not set

Page Details

Property	Value
URL	https://www.robertmylesmcdonnell.com/posts/bayes-books
Last Crawled	2026-04-22 10:37:34 (5 days ago)
First Indexed	2024-05-05 18:24:03 (1 year ago)
HTTP Status Code	200
Content
Meta Title	Rob McDonnell:Blog – Bayesian Stats: Book Recommendations
Meta Description	null
Meta Canonical	null
Boilerpipe Text	The first time I came across Bayes’ Theorem 1 , I must admit I was pretty confused. It was in Introductory Statistics by Neil A. Weiss, the course book in a statistics course I was taking at the time. Neither the logic of it nor the formula for it made much sense to me. For somebody new to probability, I was still trying to figure out what the hell P(A) actually meant . Looking back, the funny thing is that it is the branch of statistics that isn’t wont to use Bayes’ Theorem that I find confusing 2 . Bayesian statistics now makes perfect sense to me. Indeed, it follows human intuition (even if the formula looks weird for anybody new to probability). The probability of the hypothesis we have in mind, given the data we observe (that would be the \(P(A\|B)\) part, but we can rewrite it as Pr(Hypothesis\|Data) ; this is also called the posterior ) is… what exactly? Well, it’s a combination of the initial plausibility of our hypothesis ( P(A) , also called the prior ), multiplied by what’s called the likelihood ( P(B\|A) or Pr(Data\|Hypothesis) ), which is the data we would expect to see if our hypothesis were correct. The denominator is often called the ‘evidence’ or something similarly opaque, but it is merely the numerator plus its converse, that is to say, the probability that our original hypothesis is wrong by the likelihood of our data, assuming that our hypothesis is wrong. In any case, its function is just to make sure our probabilities sum to one, as they should. As you can see, all of those P(\|) and Pr can make things confusing – underneath it all, it’s simpler. First of all, the Pr() notation and the denominator are often left out, making the theorem look more like this: \[ \mathrm{posterior}\propto\mathrm{prior}\times\mathrm{likelihood} \] The plausibility of our idea, now that we have seen data, is proportional to its original plausibility times the likelihood. Well, reasonably simpler, but the fact is that most people are not comfortable thinking in terms of probability. Given that Bayes’ Theorem is the basic foundation block for an entire body of statistical literature 3 , one can see how things could get out of hand pretty quickly – hence the need for good books on the subject. I didn’t learn any Bayesian statistics in any class I ever had, I learned everything I know from reading plenty of books, sometimes the whole way through, sometimes just certain parts until I got bored, from reading on the web, from making many mistakes in R – gradually, I found my way and built my understanding of Bayesian stats. Given that I read (or skimmed) quite a lot of books on the topic, I thought I’d share my two cents on those I came across. There are certainly many more, depending on the specific area of the sciences or on the level of technicality assumed, but these are the ones that I read and either loved, liked somewhat, got bored, or simply got lost (some of them are waaay too difficult… at least for me). Let’s dive in. They’re in no particular order, by the way. Probably the first book that comes to mind is one of the first that I read on the topic, Simon Jackman’s Bayesian Analysis for the Social Sciences . Jackman’s book 4 has a nice introduction to the topic and the first few chapters are reasonably easy to follow. However, his mathematics are detailed and even a bit pedantic (nothing wrong with that in an academic book) and things can get heavy going very quickly. Reading through derivations of the conjugacy of probability distributions convinced me I needed to a) go back and re-learn calculus again (which I did), and b) go a little further back in the tree of books on Bayesian statistics. This led me down a few interesting paths, from de Finetti’s famous “PROBABILITY DOES NOT EXIST” statement, which I originally saw in Jackman’s book and then hunted down the original (I enjoyed the start quite a lot but then got bored), to learning JAGS to go along with Jackman’s examples, to some rather unnecessary and heavy-going books Tanner , for example). However, I decided to go the ‘source’ (in a modern context) and so I started reading Bernardo and Smith’s Bayesian Theory , which really helped to give me a solid understanding of the concepts involved. It’s a detailed read, and well recommended if you want a deeper understanding of the concepts behind Bayesian statistics. Moving from concepts to application, I found Donald Berry’s Statistics: a Bayesian Perspective really useful for getting a grip on the basic elements. The book is a bit dated now, and is targeted at an undergraduate audience, which is actually something in its favour. Up until Kruschke and McElreath’s books (mentioned later), most Bayesian stats books seemed to be aimed at people who were already experts in statistics (or knowledgeable, at least), with the aim of convincing them why they should switch to Bayesian methods. As a result, a lot of these books dive headlong into subjects that are not appropriate for most students (see Jackman’s conjugacy discussions, above) and have the effect of turning a lot of students off the material. Berry’s book, although limited, does the opposite. There are other statistics books that cover Bayesian ideas, such as Bolstad’s (I personally didn’t like his style) and deGroot & Schervish’s well-known book , which is a fine book, but very dry for my taste. There are also other introductory Bayesian statistics books, such as those by Lynch and Hoff , neither of which really stuck with me. Actually, most Bayesian stats books I read didn’t stick with me. ( Lee also has an introductory text, but I haven’t read it.) Sticking with introductory level, John Kruschke has a popular book with its quirky dog cover. I came a bit late to the Kruschke party, which meant that the two biggest advantages of the book (easy explanation of Bayesian stats and pre-written R functions) were not particularly useful to me, as I was already reasonably proficient in R and understood Bayesian stats quite well. Still, his book is very popular for a reason, which shows just how prevalent the problem of ‘writing-Bayesian-stats-books-for-people-who-are-already-awesome-at-statistics’ is. However, I like to open things up and poke around, and his closed system of pre-written functions didn’t work so well for me (plus, the functions are quite badly written, in my opinion). Closed ecosystems of functions like this are a thing of the past (see Laplace’s Demon ), the future is incorporating well-known methods and function calls with Bayesian machinery running under the hood (such as rstanarm ). Anyway, for someone starting off, it’s a recommended read. Kruschke has some interesting papers too. There are also some ‘classics’ of Bayesian statistics, perhaps the most well-known being the canonical Bayesian Data Analysis by Andrew Gelman & co. I don’t know what keeps me away from this book. It’s very highly regarded (pretty much as the book on Bayesian stats) and well-written, and has a section on computation with Stan etc., but I just never seem to sit down and read it. Who knows why 5 . Of course, there are tons of books on this subject. I could literally fill a long blog post on the books I started and just didn’t like for whatever reason. There are some that are encyclopaedic ( Congdon’s long list of Bayesian books, for example), others, designed for business students, that were just kind of ‘meh’: Marin & Robert and Rossi , for example. Others are useful for learning Bayesian methods and R, such as Jim Albert’s book , Bayesian Computation with R . This is a really useful book actually, but like I said earlier with reference to Kruschke, by the time I came to it, I was already using JAGS and on a different path and so I had no great use for Albert’s R functions. Still, it’s a well-regarded book by an acknowledged R expert. There are also good books on Bayesian econometrics ( Koop ) and time-series ( Pole, West & Harrison , haaard) and Jeff Gill has one for the social & behavioural sciences (I really didn’t get into this); there are also many others throughout the specific fields of the sciences. The emphasis on R is something that has carried through to the newer batch of Bayesian statistics books, which place more emphasis on the ‘data analysis’ part (that is, being empirical instead of theoretical, and getting your hands dirty with computer programming in R from the off) than on theoretical underpinnings. You will find some targeted at a Python audience, for example, Davidson-Pilon’s book , which is available as an editable github-type-webbook on the net, as well as a printed book. (Its title will tell you a bit about the Python audience it aims for – more computer programmers than the academic/statistician audience that use R. Indeed, most of the Python Bayesian analysis resources are found on the web as opposed to in books. Although some are just books on the web in pdf format .) For me personally, this is good news. I liked the theoretical knowledge that I gained from Bernardo & Smith, but once I had to delve into understanding matrix algebra (just to see a linear regression derivation) or complex characteristics of probability distributions, I was already thinking of how I’d rather have a beer. Programming, at least for me, is a perfect way to connect the theory and the concepts to the reality of actually doing some Bayesian analysis. This brings me nicely to two books that I think really utilise this approach to good effect, one recent, one a decade or so old: Richard McElreath’s Statistical Rethinking (new) and Gelman & Hill’s Data Analysis Using Regression and Multilevel/Hierarchical Models (older). While I’ve only recently started reading McElreath’s book, it seems like exactly the type of book that I would have liked to have when I started out. No complicated mathematics, just sensible advice and a heavy emphasis on doing analysis in R. The book is well-written (and very well backed up with references, there’s a ton of information to follow-up from the endnotes if you’re so inclined) and contains lucid arguments for why the author believes we need to approach statistics from a fresh angle. Although I haven’t finished it, I do recommend it already. It’s similar in some ways to Gelman & Hill’s book, and one can see the influence of Andrew Gelman (particularly through his emphasis on the ‘doing’ of Bayesian statistics) in Statistical Rethinking. Data Analysis Using… is likewise focused on analysis, learned through computer programming. It features both frequentist and Bayesian takes on statistical methods, and contains detailed computer code for (the now somewhat dated) BUGS language (see here for why BUGS and its cousin JAGS are not always optimal for Bayesian analysis). It also contains some sage advice for researchers: try out simple models using quick methods like lm() as you build up your model (advice that I certainly needed on at least one occasion). So that’s my take on Bayesian statistics/data analysis books. As befits the age we live in, you’ll likely learn just as much from sites like Stack Overflow or from blog posts and RPubs and Github than you will from books. Academic papers often helped me more than books too. Still, a good book can teach you a hell of a lot in a consistent way. There are many referenced in this post, some better than others, but all have their qualities. For me specifically, someone who is not mad about reading lots of mathematics, the last two are my recommendations. For others, this will obviously be different (I know someone who loves Jackman’s book, for example). By the way, there are some informal books on the subject, such as Nate Silver’s The Signal and the Noise , or McGrayne’s The Theory That Would Not Die . I’ve given Amazon or publisher links for all these books, bar a few, but they can be found in other places too…you know what I’m talking about. Buy the ones you like, though! 👮 Footnotes Maybe this formula wouldn’t have made much sense to the ol’ Reverend Thomas Bayes either, since he used the Newtonian style of geometric exposition. For the history of the theorem, see McGrayne . For a history of statistics in general, including Bayes (and Laplace, who probably did much more to develop Bayesian statistics than Bayes ever did) see the fantastic Stigler . ↩︎ That would be ‘frequentist’ or ‘classical’ (or ‘traditional’) statistics (take your pick of adjective). Most of the Bayesian books above will have sections comparing the traditions. McElreath doesn’t bother, which is a nice development in its own way. ↩︎ Although there is only ever one core method: posterior . ↩︎ I’ve learned much more from Jackman’s various political science articles, in which he uses Bayesian methods, than this book, to be honest. ↩︎ Speaking of Gelman, he has a literal treasure trove of papers and web discussions on the subject of Bayesian data analysis. See his site for many links to those. ↩︎
Markdown	[Rob McDonnell:Blog](https://www.robertmylesmcdonnell.com/) - [About](https://www.robertmylesmcdonnell.com/about) # Bayesian Stats: Book Recommendations Bayesian stats Published May 3, 2016 The first time I came across Bayes’ Theorem[1](https://www.robertmylesmcdonnell.com/posts/bayes-books#fn1), I must admit I was pretty confused. It was in [Introductory Statistics](https://www.amazon.com/Introductory-Statistics-9th-Neil-Weiss/dp/0321691229/ref=pd_cp_14_1?ie=UTF8&refRID=15HFJDGMBF49CXAD9W9S) by Neil A. Weiss, the course book in a statistics course I was taking at the time. Neither the logic of it nor the formula for it made much sense to me. For somebody new to probability, I was still trying to figure out what the hell P(A) actually meant. Looking back, the funny thing is that it is the branch of statistics that isn’t wont to use Bayes’ Theorem that I find confusing[2](https://www.robertmylesmcdonnell.com/posts/bayes-books#fn2). Bayesian statistics now makes perfect sense to me. Indeed, it follows human intuition (even if the formula looks weird for anybody new to probability). The probability of the hypothesis we have in mind, given the data we observe (that would be the \\(P(A\\|B)\\) part, but we can rewrite it as Pr(Hypothesis\\|Data); this is also called the posterior ) is… what exactly? Well, it’s a combination of the initial plausibility of our hypothesis (P(A), also called the prior ), multiplied by what’s called the likelihood (P(B\\|A) or Pr(Data\\|Hypothesis)), which is the data we would expect to see if our hypothesis were correct. The denominator is often called the ‘evidence’ or something similarly opaque, but it is merely the numerator plus its converse, that is to say, the probability that our original hypothesis is wrong by the likelihood of our data, assuming that our hypothesis is wrong. In any case, its function is just to make sure our probabilities sum to one, as they should. As you can see, all of those P(\\|) and Pr can make things confusing – underneath it all, it’s simpler. First of all, the Pr() notation and the denominator are often left out, making the theorem look more like this: \\\[ \\mathrm{posterior}\\propto\\mathrm{prior}\\times\\mathrm{likelihood} \\\] The plausibility of our idea, now that we have seen data, is proportional to its original plausibility times the likelihood. Well, reasonably simpler, but the fact is that most people are not comfortable thinking in terms of probability. Given that Bayes’ Theorem is the basic foundation block for an entire body of statistical literature[3](https://www.robertmylesmcdonnell.com/posts/bayes-books#fn3), one can see how things could get out of hand pretty quickly – hence the need for good books on the subject. I didn’t learn any Bayesian statistics in any class I ever had, I learned everything I know from reading plenty of books, sometimes the whole way through, sometimes just certain parts until I got bored, from reading on the web, from making many mistakes in R – gradually, I found my way and built my understanding of Bayesian stats. Given that I read (or skimmed) quite a lot of books on the topic, I thought I’d share my two cents on those I came across. There are certainly many more, depending on the specific area of the sciences or on the level of technicality assumed, but these are the ones that I read and either loved, liked somewhat, got bored, or simply got lost (some of them are waaay too difficult… at least for me). Let’s dive in. They’re in no particular order, by the way. ![](https://i.imgur.com/b0EkBG5.jpg?1) Probably the first book that comes to mind is one of the first that I read on the topic, Simon Jackman’s [Bayesian Analysis for the Social Sciences](https://onlinelibrary.wiley.com/book/10.1002/9780470686621). Jackman’s book[4](https://www.robertmylesmcdonnell.com/posts/bayes-books#fn4) has a nice introduction to the topic and the first few chapters are reasonably easy to follow. However, his mathematics are detailed and even a bit pedantic (nothing wrong with that in an academic book) and things can get heavy going very quickly. Reading through derivations of the conjugacy of probability distributions convinced me I needed to a) go back and re-learn calculus again (which I did), and b) go a little further back in the tree of books on Bayesian statistics. ![](https://i.imgur.com/jpNFefC.png?1) This led me down a few interesting paths, from de Finetti’s famous “PROBABILITY DOES NOT EXIST” statement, which I originally saw in Jackman’s book and then hunted down the original (I enjoyed the start quite a lot but then got bored), to learning JAGS to go along with Jackman’s examples, to some rather unnecessary and heavy-going books [Tanner](https://link.springer.com/book/10.1007%2F978-1-4612-4024-2), for example). However, I decided to go the ‘source’ (in a modern context) and so I started reading Bernardo and Smith’s [Bayesian Theory](https://onlinelibrary.wiley.com/book/10.1002/9780470316870), which really helped to give me a solid understanding of the concepts involved. It’s a detailed read, and well recommended if you want a deeper understanding of the concepts behind Bayesian statistics. Moving from concepts to application, I found Donald Berry’s [Statistics: a Bayesian Perspective](https://www.amazon.com/Statistics-Bayesian-Perspective-Donald-Berry/dp/0534234720) really useful for getting a grip on the basic elements. The book is a bit dated now, and is targeted at an undergraduate audience, which is actually something in its favour. Up until Kruschke and McElreath’s books (mentioned later), most Bayesian stats books seemed to be aimed at people who were already experts in statistics (or knowledgeable, at least), with the aim of convincing them why they should switch to Bayesian methods. As a result, a lot of these books dive headlong into subjects that are not appropriate for most students (see Jackman’s conjugacy discussions, above) and have the effect of turning a lot of students off the material. Berry’s book, although limited, does the opposite. There are other statistics books that cover Bayesian ideas, such as [Bolstad’s](https://www.amazon.com/Introduction-Bayesian-Statistics-William-Bolstad/dp/0470141158) (I personally didn’t like his style) and deGroot & Schervish’s well-known [book](https://www.pearsonhighered.com/program/De-Groot-Probability-and-Statistics-4th-Edition/PGM146802.html), which is a fine book, but very dry for my taste. There are also other introductory Bayesian statistics books, such as those by [Lynch](https://www.springer.com/us/book/9780387712642) and [Hoff](https://www.springer.com/us/book/9780387922997), neither of which really stuck with me. Actually, most Bayesian stats books I read didn’t stick with me. ([Lee](https://www.amazon.com/Bayesian-Statistics-Introduction-Peter-Lee/dp/1118332571) also has an introductory text, but I haven’t read it.) Sticking with introductory level, John Kruschke has a popular [book](https://sites.google.com/site/doingbayesiandataanalysis/) with its quirky dog cover. ![](https://www.robertmylesmcdonnell.com/posts/images/kruschkeDogs.jpg) I came a bit late to the Kruschke party, which meant that the two biggest advantages of the book (easy explanation of Bayesian stats and pre-written R functions) were not particularly useful to me, as I was already reasonably proficient in R and understood Bayesian stats quite well. Still, his book is very popular for a reason, which shows just how prevalent the problem of ‘writing-Bayesian-stats-books-for-people-who-are-already-awesome-at-statistics’ is. However, I like to open things up and poke around, and his closed system of pre-written functions didn’t work so well for me (plus, the functions are quite badly written, in my opinion). Closed ecosystems of functions like this are a thing of the past (see [Laplace’s Demon](https://cran.r-project.org/web/packages/LaplacesDemon/index.html)), the future is incorporating well-known methods and function calls with Bayesian machinery running under the hood (such as [rstanarm](https://cran.r-project.org/web/packages/rstanarm/vignettes/rstanarm.html)). Anyway, for someone starting off, it’s a recommended read. Kruschke has some interesting [papers](https://www.indiana.edu/~kruschke/BEST/BEST.pdf) too. There are also some ‘classics’ of Bayesian statistics, perhaps the most well-known being the canonical [Bayesian Data Analysis](https://www.stat.columbia.edu/~gelman/book/) by Andrew Gelman & co. I don’t know what keeps me away from this book. It’s very highly regarded (pretty much as the book on Bayesian stats) and well-written, and has a section on computation with [Stan](https://mc-stan.org/) etc., but I just never seem to sit down and read it. Who knows why[5](https://www.robertmylesmcdonnell.com/posts/bayes-books#fn5). ![](https://i.imgur.com/hfaZkl3.png?1) Of course, there are tons of books on this subject. I could literally fill a long blog post on the books I started and just didn’t like for whatever reason. There are some that are encyclopaedic ([Congdon’s](https://webspace.qmul.ac.uk/pcongdon/) long list of Bayesian books, for example), others, designed for business students, that were just kind of ‘meh’: [Marin & Robert](https://www.springer.com/us/book/9780387389837) and [Rossi](https://www.wiley.com/WileyCDA/WileyTitle/productCd-0470863676.html), for example. Others are useful for learning Bayesian methods and R, such as Jim Albert’s [book](https://books.google.com.br/books/about/Bayesian_Computation_with_R.html?id=kVk_WHFfIrMC&redir_esc=y), Bayesian Computation with R. This is a really useful book actually, but like I said earlier with reference to Kruschke, by the time I came to it, I was already using JAGS and on a different path and so I had no great use for Albert’s R functions. Still, it’s a well-regarded book by an acknowledged R expert. There are also good books on Bayesian econometrics ([Koop](https://www.wiley.com/legacy/wileychi/koopbayesian/)) and time-series ([Pole, West & Harrison](https://books.google.com.br/books/about/Applied_Bayesian_Forecasting_and_Time_Se.html?id=LAcp-ZwnyxIC&redir_esc=y), haaard) and Jeff Gill has [one](https://www.crcpress.com/Bayesian-Methods-A-Social-and-Behavioral-Sciences-Approach-Third-Edition/Gill/9781439862483) for the social & behavioural sciences (I really didn’t get into this); there are also many others throughout the specific fields of the sciences. ![](https://i.imgur.com/M7y9pgs.png?2) The emphasis on R is something that has carried through to the newer batch of Bayesian statistics books, which place more emphasis on the ‘data analysis’ part (that is, being empirical instead of theoretical, and getting your hands dirty with computer programming in R from the off) than on theoretical underpinnings. You will find some targeted at a Python audience, for example, Davidson-Pilon’s [book](https://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/), which is available as an editable github-type-webbook on the net, as well as a printed book. (Its title will tell you a bit about the Python audience it aims for – more computer programmers than the academic/statistician audience that use R. Indeed, most of the Python Bayesian analysis resources are found on the web as opposed to in books. Although some are just books on the web in [pdf format](https://www.greenteapress.com/thinkbayes/thinkbayes.pdf).) For me personally, this is good news. I liked the theoretical knowledge that I gained from Bernardo & Smith, but once I had to delve into understanding matrix algebra (just to see a linear regression derivation) or complex characteristics of probability distributions, I was already thinking of how I’d rather have a beer. Programming, at least for me, is a perfect way to connect the theory and the concepts to the reality of actually doing some Bayesian analysis. This brings me nicely to two books that I think really utilise this approach to good effect, one recent, one a decade or so old: Richard McElreath’s [Statistical Rethinking](https://xcelab.net/rm/statistical-rethinking/) (new) and Gelman & Hill’s [Data Analysis Using Regression and Multilevel/Hierarchical Models](https://www.stat.columbia.edu/~gelman/arm/) (older). While I’ve only recently started reading McElreath’s book, it seems like exactly the type of book that I would have liked to have when I started out. No complicated mathematics, just sensible advice and a heavy emphasis on doing analysis in R. The book is well-written (and very well backed up with references, there’s a ton of information to follow-up from the endnotes if you’re so inclined) and contains lucid arguments for why the author believes we need to approach statistics from a fresh angle. Although I haven’t finished it, I do recommend it already. It’s similar in some ways to Gelman & Hill’s book, and one can see the influence of Andrew Gelman (particularly through his emphasis on the ‘doing’ of Bayesian statistics) in Statistical Rethinking. Data Analysis Using… is likewise focused on analysis, learned through computer programming. It features both frequentist and Bayesian takes on statistical methods, and contains detailed computer code for (the now somewhat dated) BUGS language (see [here](https://robertmylesmcdonnell.com/stan-or-jags-for-bayesian-ideal-point-irt) for why BUGS and its cousin JAGS are not always optimal for Bayesian analysis). It also contains some sage advice for researchers: try out simple models using quick methods like `lm()` as you build up your model (advice that I certainly needed on at least one occasion). ![](https://i.imgur.com/oH57M4t.png?1) So that’s my take on Bayesian statistics/data analysis books. As befits the age we live in, you’ll likely learn just as much from sites like [Stack Overflow](https://stackoverflow.com/search?q=Bayesian+) or from [blog](https://darrenjw.wordpress.com/2012/11/20/getting-started-with-bayesian-variable-selection-using-jags-and-rjags/) [posts](https://twiecki.github.io/blog/2015/11/10/mcmc-sampling/) and [RPubs](https://rpubs.com/corey_sparks/30893) and [Github](https://github.com/robertmyles/bayesian-ideal-point-irt-models) than you will from books. Academic papers often helped me more than books too. Still, a good book can teach you a hell of a lot in a consistent way. There are many referenced in this post, some better than others, but all have their qualities. For me specifically, someone who is not mad about reading lots of mathematics, the last two are my recommendations. For others, this will obviously be different (I know someone who loves Jackman’s book, for example). By the way, there are some informal books on the subject, such as Nate Silver’s [The Signal and the Noise](https://www.penguinrandomhouse.com/books/305826/the-signal-and-the-noise-by-nate-silver/9780143125082/), or McGrayne’s [The Theory That Would Not Die](https://www.amazon.ca/Theory-That-Would-Not-Die/dp/0300169698/181-0429523-7916830?ie=UTF8&tag=vglnk-ca-c250-20). I’ve given Amazon or publisher links for all these books, bar a few, but they can be found in other places too…you know what I’m talking about. Buy the ones you like, though\! 👮 ## Footnotes 1. Maybe this formula wouldn’t have made much sense to the ol’ Reverend Thomas Bayes either, since he used the Newtonian style of geometric exposition. For the history of the theorem, see [McGrayne](https://www.amazon.ca/Theory-That-Would-Not-Die/dp/0300169698/181-0429523-7916830?ie=UTF8&tag=vglnk-ca-c250-20). For a history of statistics in general, including Bayes (and Laplace, who probably did much more to develop Bayesian statistics than Bayes ever did) see the fantastic [Stigler](https://www.hup.harvard.edu/catalog.php?isbn=9780674403413).[↩︎](https://www.robertmylesmcdonnell.com/posts/bayes-books#fnref1) 2. That would be ‘frequentist’ or ‘classical’ (or ‘traditional’) statistics (take your pick of adjective). Most of the Bayesian books above will have sections comparing the traditions. McElreath doesn’t bother, which is a nice development in its own way.[↩︎](https://www.robertmylesmcdonnell.com/posts/bayes-books#fnref2) 3. Although there is only ever one core method: posterior .[↩︎](https://www.robertmylesmcdonnell.com/posts/bayes-books#fnref3) 4. I’ve learned much more from Jackman’s various political science articles, in which he uses Bayesian methods, than this book, to be honest.[↩︎](https://www.robertmylesmcdonnell.com/posts/bayes-books#fnref4) 5. Speaking of Gelman, he has a literal treasure trove of papers and web discussions on the subject of Bayesian data analysis. See his [site](https://andrewgelman.com/) for many links to those.[↩︎](https://www.robertmylesmcdonnell.com/posts/bayes-books#fnref5)
Readable Markdown	The first time I came across Bayes’ Theorem[1](https://www.robertmylesmcdonnell.com/posts/bayes-books#fn1), I must admit I was pretty confused. It was in [Introductory Statistics](https://www.amazon.com/Introductory-Statistics-9th-Neil-Weiss/dp/0321691229/ref=pd_cp_14_1?ie=UTF8&refRID=15HFJDGMBF49CXAD9W9S) by Neil A. Weiss, the course book in a statistics course I was taking at the time. Neither the logic of it nor the formula for it made much sense to me. For somebody new to probability, I was still trying to figure out what the hell P(A) actually meant. Looking back, the funny thing is that it is the branch of statistics that isn’t wont to use Bayes’ Theorem that I find confusing[2](https://www.robertmylesmcdonnell.com/posts/bayes-books#fn2). Bayesian statistics now makes perfect sense to me. Indeed, it follows human intuition (even if the formula looks weird for anybody new to probability). The probability of the hypothesis we have in mind, given the data we observe (that would be the \\(P(A\\|B)\\) part, but we can rewrite it as Pr(Hypothesis\\|Data); this is also called the posterior ) is… what exactly? Well, it’s a combination of the initial plausibility of our hypothesis (P(A), also called the prior ), multiplied by what’s called the likelihood (P(B\\|A) or Pr(Data\\|Hypothesis)), which is the data we would expect to see if our hypothesis were correct. The denominator is often called the ‘evidence’ or something similarly opaque, but it is merely the numerator plus its converse, that is to say, the probability that our original hypothesis is wrong by the likelihood of our data, assuming that our hypothesis is wrong. In any case, its function is just to make sure our probabilities sum to one, as they should. As you can see, all of those P(\\|) and Pr can make things confusing – underneath it all, it’s simpler. First of all, the Pr() notation and the denominator are often left out, making the theorem look more like this: \\\[ \\mathrm{posterior}\\propto\\mathrm{prior}\\times\\mathrm{likelihood} \\\] The plausibility of our idea, now that we have seen data, is proportional to its original plausibility times the likelihood. Well, reasonably simpler, but the fact is that most people are not comfortable thinking in terms of probability. Given that Bayes’ Theorem is the basic foundation block for an entire body of statistical literature[3](https://www.robertmylesmcdonnell.com/posts/bayes-books#fn3), one can see how things could get out of hand pretty quickly – hence the need for good books on the subject. I didn’t learn any Bayesian statistics in any class I ever had, I learned everything I know from reading plenty of books, sometimes the whole way through, sometimes just certain parts until I got bored, from reading on the web, from making many mistakes in R – gradually, I found my way and built my understanding of Bayesian stats. Given that I read (or skimmed) quite a lot of books on the topic, I thought I’d share my two cents on those I came across. There are certainly many more, depending on the specific area of the sciences or on the level of technicality assumed, but these are the ones that I read and either loved, liked somewhat, got bored, or simply got lost (some of them are waaay too difficult… at least for me). Let’s dive in. They’re in no particular order, by the way. ![](https://i.imgur.com/b0EkBG5.jpg?1) Probably the first book that comes to mind is one of the first that I read on the topic, Simon Jackman’s [Bayesian Analysis for the Social Sciences](https://onlinelibrary.wiley.com/book/10.1002/9780470686621). Jackman’s book[4](https://www.robertmylesmcdonnell.com/posts/bayes-books#fn4) has a nice introduction to the topic and the first few chapters are reasonably easy to follow. However, his mathematics are detailed and even a bit pedantic (nothing wrong with that in an academic book) and things can get heavy going very quickly. Reading through derivations of the conjugacy of probability distributions convinced me I needed to a) go back and re-learn calculus again (which I did), and b) go a little further back in the tree of books on Bayesian statistics. ![](https://i.imgur.com/jpNFefC.png?1) This led me down a few interesting paths, from de Finetti’s famous “PROBABILITY DOES NOT EXIST” statement, which I originally saw in Jackman’s book and then hunted down the original (I enjoyed the start quite a lot but then got bored), to learning JAGS to go along with Jackman’s examples, to some rather unnecessary and heavy-going books [Tanner](https://link.springer.com/book/10.1007%2F978-1-4612-4024-2), for example). However, I decided to go the ‘source’ (in a modern context) and so I started reading Bernardo and Smith’s [Bayesian Theory](https://onlinelibrary.wiley.com/book/10.1002/9780470316870), which really helped to give me a solid understanding of the concepts involved. It’s a detailed read, and well recommended if you want a deeper understanding of the concepts behind Bayesian statistics. Moving from concepts to application, I found Donald Berry’s [Statistics: a Bayesian Perspective](https://www.amazon.com/Statistics-Bayesian-Perspective-Donald-Berry/dp/0534234720) really useful for getting a grip on the basic elements. The book is a bit dated now, and is targeted at an undergraduate audience, which is actually something in its favour. Up until Kruschke and McElreath’s books (mentioned later), most Bayesian stats books seemed to be aimed at people who were already experts in statistics (or knowledgeable, at least), with the aim of convincing them why they should switch to Bayesian methods. As a result, a lot of these books dive headlong into subjects that are not appropriate for most students (see Jackman’s conjugacy discussions, above) and have the effect of turning a lot of students off the material. Berry’s book, although limited, does the opposite. There are other statistics books that cover Bayesian ideas, such as [Bolstad’s](https://www.amazon.com/Introduction-Bayesian-Statistics-William-Bolstad/dp/0470141158) (I personally didn’t like his style) and deGroot & Schervish’s well-known [book](https://www.pearsonhighered.com/program/De-Groot-Probability-and-Statistics-4th-Edition/PGM146802.html), which is a fine book, but very dry for my taste. There are also other introductory Bayesian statistics books, such as those by [Lynch](https://www.springer.com/us/book/9780387712642) and [Hoff](https://www.springer.com/us/book/9780387922997), neither of which really stuck with me. Actually, most Bayesian stats books I read didn’t stick with me. ([Lee](https://www.amazon.com/Bayesian-Statistics-Introduction-Peter-Lee/dp/1118332571) also has an introductory text, but I haven’t read it.) Sticking with introductory level, John Kruschke has a popular [book](https://sites.google.com/site/doingbayesiandataanalysis/) with its quirky dog cover. ![](https://www.robertmylesmcdonnell.com/posts/images/kruschkeDogs.jpg) I came a bit late to the Kruschke party, which meant that the two biggest advantages of the book (easy explanation of Bayesian stats and pre-written R functions) were not particularly useful to me, as I was already reasonably proficient in R and understood Bayesian stats quite well. Still, his book is very popular for a reason, which shows just how prevalent the problem of ‘writing-Bayesian-stats-books-for-people-who-are-already-awesome-at-statistics’ is. However, I like to open things up and poke around, and his closed system of pre-written functions didn’t work so well for me (plus, the functions are quite badly written, in my opinion). Closed ecosystems of functions like this are a thing of the past (see [Laplace’s Demon](https://cran.r-project.org/web/packages/LaplacesDemon/index.html)), the future is incorporating well-known methods and function calls with Bayesian machinery running under the hood (such as [rstanarm](https://cran.r-project.org/web/packages/rstanarm/vignettes/rstanarm.html)). Anyway, for someone starting off, it’s a recommended read. Kruschke has some interesting [papers](https://www.indiana.edu/~kruschke/BEST/BEST.pdf) too. There are also some ‘classics’ of Bayesian statistics, perhaps the most well-known being the canonical [Bayesian Data Analysis](https://www.stat.columbia.edu/~gelman/book/) by Andrew Gelman & co. I don’t know what keeps me away from this book. It’s very highly regarded (pretty much as the book on Bayesian stats) and well-written, and has a section on computation with [Stan](https://mc-stan.org/) etc., but I just never seem to sit down and read it. Who knows why[5](https://www.robertmylesmcdonnell.com/posts/bayes-books#fn5). ![](https://i.imgur.com/hfaZkl3.png?1) Of course, there are tons of books on this subject. I could literally fill a long blog post on the books I started and just didn’t like for whatever reason. There are some that are encyclopaedic ([Congdon’s](https://webspace.qmul.ac.uk/pcongdon/) long list of Bayesian books, for example), others, designed for business students, that were just kind of ‘meh’: [Marin & Robert](https://www.springer.com/us/book/9780387389837) and [Rossi](https://www.wiley.com/WileyCDA/WileyTitle/productCd-0470863676.html), for example. Others are useful for learning Bayesian methods and R, such as Jim Albert’s [book](https://books.google.com.br/books/about/Bayesian_Computation_with_R.html?id=kVk_WHFfIrMC&redir_esc=y), Bayesian Computation with R. This is a really useful book actually, but like I said earlier with reference to Kruschke, by the time I came to it, I was already using JAGS and on a different path and so I had no great use for Albert’s R functions. Still, it’s a well-regarded book by an acknowledged R expert. There are also good books on Bayesian econometrics ([Koop](https://www.wiley.com/legacy/wileychi/koopbayesian/)) and time-series ([Pole, West & Harrison](https://books.google.com.br/books/about/Applied_Bayesian_Forecasting_and_Time_Se.html?id=LAcp-ZwnyxIC&redir_esc=y), haaard) and Jeff Gill has [one](https://www.crcpress.com/Bayesian-Methods-A-Social-and-Behavioral-Sciences-Approach-Third-Edition/Gill/9781439862483) for the social & behavioural sciences (I really didn’t get into this); there are also many others throughout the specific fields of the sciences. ![](https://i.imgur.com/M7y9pgs.png?2) The emphasis on R is something that has carried through to the newer batch of Bayesian statistics books, which place more emphasis on the ‘data analysis’ part (that is, being empirical instead of theoretical, and getting your hands dirty with computer programming in R from the off) than on theoretical underpinnings. You will find some targeted at a Python audience, for example, Davidson-Pilon’s [book](https://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/), which is available as an editable github-type-webbook on the net, as well as a printed book. (Its title will tell you a bit about the Python audience it aims for – more computer programmers than the academic/statistician audience that use R. Indeed, most of the Python Bayesian analysis resources are found on the web as opposed to in books. Although some are just books on the web in [pdf format](https://www.greenteapress.com/thinkbayes/thinkbayes.pdf).) For me personally, this is good news. I liked the theoretical knowledge that I gained from Bernardo & Smith, but once I had to delve into understanding matrix algebra (just to see a linear regression derivation) or complex characteristics of probability distributions, I was already thinking of how I’d rather have a beer. Programming, at least for me, is a perfect way to connect the theory and the concepts to the reality of actually doing some Bayesian analysis. This brings me nicely to two books that I think really utilise this approach to good effect, one recent, one a decade or so old: Richard McElreath’s [Statistical Rethinking](https://xcelab.net/rm/statistical-rethinking/) (new) and Gelman & Hill’s [Data Analysis Using Regression and Multilevel/Hierarchical Models](https://www.stat.columbia.edu/~gelman/arm/) (older). While I’ve only recently started reading McElreath’s book, it seems like exactly the type of book that I would have liked to have when I started out. No complicated mathematics, just sensible advice and a heavy emphasis on doing analysis in R. The book is well-written (and very well backed up with references, there’s a ton of information to follow-up from the endnotes if you’re so inclined) and contains lucid arguments for why the author believes we need to approach statistics from a fresh angle. Although I haven’t finished it, I do recommend it already. It’s similar in some ways to Gelman & Hill’s book, and one can see the influence of Andrew Gelman (particularly through his emphasis on the ‘doing’ of Bayesian statistics) in Statistical Rethinking. Data Analysis Using… is likewise focused on analysis, learned through computer programming. It features both frequentist and Bayesian takes on statistical methods, and contains detailed computer code for (the now somewhat dated) BUGS language (see [here](https://robertmylesmcdonnell.com/stan-or-jags-for-bayesian-ideal-point-irt) for why BUGS and its cousin JAGS are not always optimal for Bayesian analysis). It also contains some sage advice for researchers: try out simple models using quick methods like `lm()` as you build up your model (advice that I certainly needed on at least one occasion). ![](https://i.imgur.com/oH57M4t.png?1) So that’s my take on Bayesian statistics/data analysis books. As befits the age we live in, you’ll likely learn just as much from sites like [Stack Overflow](https://stackoverflow.com/search?q=Bayesian+) or from [blog](https://darrenjw.wordpress.com/2012/11/20/getting-started-with-bayesian-variable-selection-using-jags-and-rjags/) [posts](https://twiecki.github.io/blog/2015/11/10/mcmc-sampling/) and [RPubs](https://rpubs.com/corey_sparks/30893) and [Github](https://github.com/robertmyles/bayesian-ideal-point-irt-models) than you will from books. Academic papers often helped me more than books too. Still, a good book can teach you a hell of a lot in a consistent way. There are many referenced in this post, some better than others, but all have their qualities. For me specifically, someone who is not mad about reading lots of mathematics, the last two are my recommendations. For others, this will obviously be different (I know someone who loves Jackman’s book, for example). By the way, there are some informal books on the subject, such as Nate Silver’s [The Signal and the Noise](https://www.penguinrandomhouse.com/books/305826/the-signal-and-the-noise-by-nate-silver/9780143125082/), or McGrayne’s [The Theory That Would Not Die](https://www.amazon.ca/Theory-That-Would-Not-Die/dp/0300169698/181-0429523-7916830?ie=UTF8&tag=vglnk-ca-c250-20). I’ve given Amazon or publisher links for all these books, bar a few, but they can be found in other places too…you know what I’m talking about. Buy the ones you like, though\! 👮 ## Footnotes 1. Maybe this formula wouldn’t have made much sense to the ol’ Reverend Thomas Bayes either, since he used the Newtonian style of geometric exposition. For the history of the theorem, see [McGrayne](https://www.amazon.ca/Theory-That-Would-Not-Die/dp/0300169698/181-0429523-7916830?ie=UTF8&tag=vglnk-ca-c250-20). For a history of statistics in general, including Bayes (and Laplace, who probably did much more to develop Bayesian statistics than Bayes ever did) see the fantastic [Stigler](https://www.hup.harvard.edu/catalog.php?isbn=9780674403413).[↩︎](https://www.robertmylesmcdonnell.com/posts/bayes-books#fnref1) 2. That would be ‘frequentist’ or ‘classical’ (or ‘traditional’) statistics (take your pick of adjective). Most of the Bayesian books above will have sections comparing the traditions. McElreath doesn’t bother, which is a nice development in its own way.[↩︎](https://www.robertmylesmcdonnell.com/posts/bayes-books#fnref2) 3. Although there is only ever one core method: posterior .[↩︎](https://www.robertmylesmcdonnell.com/posts/bayes-books#fnref3) 4. I’ve learned much more from Jackman’s various political science articles, in which he uses Bayesian methods, than this book, to be honest.[↩︎](https://www.robertmylesmcdonnell.com/posts/bayes-books#fnref4) 5. Speaking of Gelman, he has a literal treasure trove of papers and web discussions on the subject of Bayesian data analysis. See his [site](https://andrewgelman.com/) for many links to those.[↩︎](https://www.robertmylesmcdonnell.com/posts/bayes-books#fnref5)
ML Classification
ML Categories	null
ML Page Types	null
ML Intent Types	null
Content Metadata
Language	en
Author	null
Publish Time	not set
Original Publish Time	2024-05-05 18:24:03 (1 year ago)
Republished	No
Word Count (Total)	2,231
Word Count (Content)	2,211
Links
External Links	48
Internal Links	4
Technical SEO
Meta Nofollow	No
Meta Noarchive	No
JS Rendered	No
Redirect Target	null
Performance
Download Time (ms)	594
TTFB (ms)	492
Download Size (bytes)	12,948
Shard	169 (laksa)
Root Hash	17772841060222480569
Unparsed URL	com,robertmylesmcdonnell!www,/posts/bayes-books s443