DevonThink for Archival Research

In a post about the wonders of ABBYY FineScanner back in May, I promised to write about another pillar of my archival process, the database management program, DevonThink Pro Office. Like ABBYY FineScanner, it’s quite pricey ($149.95 after a 150-hour test-drive), but coming up on 15 months together I couldn’t imagine my life without it.

I should say at the outset that I can claim no particular expertise with regard to this program. I have no doubt that someone with more technical skill could wring much more from it than I can. I should also note that the program is only available for Mac — I know, I know — so if you haven’t been sucked into the Apple vortex, this post won’t be of much use to you. But my fellow Mac-owning archival researchers looking to build a digital database may find something of value in the ensuing description of the DevonThink process I’ve come to rely on over the past year.

When I fire up the program and open my Dissertation database, I’m met with the menu you see below to the left. At the top are a few items: Inbox, the default repository for new files I drag into the program; Tags, which I don’t really use; Mobile Sync, a reception point for items that come in through the DevonThink ToGo mobile app; and Evernote, which receives clippings I make with the Evernote app. (You’ll find a bit more on these last two at the bottom of the post.) All of these came with the program or with apps I connected to it, as did the four items at the bottom of the list (i.e., All Images, All PDF Documents, Duplicates, and Orphaned Files). The stuff in between, though, is user-generated.

The home screen.

The header labeled Archives is where I put the documents I scan and the notes I take on them, organized by country and then by archive. Books/Articles is where I take notes on secondary sources; it’s also organized geographically. For Others holds documents unrelated to my own project that may be of interest to friends and colleagues. Internet (Clippings/Links) is where I sort stray news articles and websites of interest. Logistics is home to information about the infrastructure of academic life: fellowships, grants, conference funding, seminars, and the like. Notebook is where I take notes and organize documents in ways that cut across multiple archives. Random/Interesting is self-explanatory, and Teaching Aids is where I put things that may be helpful for teaching all of this when I’m back home.

When I’m at the archive itself, the Archives header is, unsurprisingly, where most of the action is. Let’s imagine I’m spending the day at Argentina’s National Library. The “Biblioteca Nacional” folder has three subfolders, which correspond to the three divisions I’ve used so far: Archivo, Historia Oral, and Libros. As I work through an archival collection, I’ll create a subheader for the collection, then one for each of its boxes that I consult, and finally for each archival folder of interest.

Let’s say I’m working with the Silvio Frondizi Subcollection, on which my recent post, Revolutionary Human Rights, was based. More specifically, I’m looking through a folder from Box 7, labeled “Movimiento Nacional contra la Represión y la Tortura 1/2” (see below). When I come across a document I want to take note of, I’ll create a Rich Text File (RTF) in the corresponding folder, titled first with the date as closely as I know or can approximate it, and then either its title or a phrase that more effectively conveys its use. (I mark my own date approximations with question marks.) In the body of the RTF, I’ll include any document or page numbers that I may need for later citation followed by whatever thoughts have come into my mind. In cases where I have general observations about a folder, or a box, or an entire archival collection, I’ll create a separate RTF file in the corresponding place in the database titled “0 Overall” and take notes there. (The initial 0 is a way to make sure the file jumps to the top of the alpha-numeric heap.)

The contents of the folder, “Movimiento Nacional contra la Represión y la Tortura 1/2.” At right, above the line, an alphabetized list of the files it contains. Below it, a space to scroll through them. At left, nesting drop-down menus organized by Country, then Archive, Collection, Box, and Folder.
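For anyone who likes to see a convention spelled out, here is a minimal Python sketch of the naming scheme described above. It is only an illustration: DevonThink handles the sorting on its own and may order names slightly differently than plain string sorting does. The helper name `note_title` and the second example title are hypothetical.

```python
def note_title(date, title, approximate=False):
    """Build a note title: date first, then a title or descriptive phrase."""
    marker = "?" if approximate else ""  # "?" marks an approximated date
    return f"{date}{marker} {title}"


titles = [
    note_title("1974-03", "Press release (hypothetical example)"),
    note_title("1972", "Ellos son torturados…", approximate=True),
    "0 Overall",  # general notes on the folder, box, or collection
]

# Plain string sorting puts "0 Overall" first, then the notes in date order.
for title in sorted(titles):
    print(title)
```

Leading with the date is what keeps a folder in rough chronological order, and the initial 0 sorts ahead of any year.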

If a document is worth copying and I am permitted to take photos, I’ll scan it with my cell phone and convert it into an OCR-recognized PDF, which I will then label with the same name as the related RTF I’ve just created in DevonThink. Then, when I get home, I can upload the PDFs from my phone and easily sort them into their corresponding DevonThink folders. As a final step, I’ll right-click on the PDF, choose “Copy Item Link,” and paste a permalink to the PDF into the RTF (see below). That turns the RTF into an all-purpose base of operations, which I can then use as a building block for subsequent indexing.

Copying the item link to the PDF for “1972? Ellos son torturados….” I will then paste this permalink into the identically named RTF.
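Because each PDF carries the same name as its RTF, checking that every scan has a note (and vice versa) is a simple matching exercise. The sketch below shows the idea in Python, working on a plain list of file names. It does not talk to DevonThink at all, and the function name `pair_notes_with_scans` is a hypothetical helper introduced only for illustration.

```python
from pathlib import Path


def pair_notes_with_scans(filenames):
    """Match each RTF note to the PDF scan that shares its name."""
    notes, scans = {}, {}
    for name in filenames:
        path = Path(name)
        if path.suffix.lower() == ".rtf":
            notes[path.stem] = name
        elif path.suffix.lower() == ".pdf":
            scans[path.stem] = name
    # Names present in both groups are matched pairs.
    shared = sorted(notes.keys() & scans.keys())
    paired = {stem: (notes[stem], scans[stem]) for stem in shared}
    # Names present in only one group have no counterpart.
    unmatched = sorted(notes.keys() ^ scans.keys())
    return paired, unmatched


files = [
    "1972? Ellos son torturados….rtf",
    "1972? Ellos son torturados….pdf",
    "0 Overall.rtf",  # general notes, so no scan is expected
]
paired, unmatched = pair_notes_with_scans(files)
print(paired)
print(unmatched)
```

Here the "0 Overall" note shows up as unmatched, which is expected, since general notes have no scan attached.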

What kind of indexing? Sometimes an archival collection is already organized in ways that make sense for my research. The Silvio Frondizi Subcollection, for instance, groups documents chronologically and by organization or project, which is exactly how I want them. On the level of the collection itself, then, there’s no need for further reshuffling.

But other collections aren’t arranged in ways that are helpful to my work. This is particularly true of police and military archives, which typically operate through master indexes of names but are physically organized into vast collections based on other considerations, such as reporting unit or jurisdiction. I want to preserve this original system of organization, both because I will need to specify where I found the documents that I ultimately cite, and because each security organ’s proprietary system is a window onto the repressive logics I am trying to understand. But relying exclusively on these original systems would greatly hobble my ability to draw connections across the archive and to conceptualize it in ways that correspond to my arguments.

In these cases, I create archive-specific indexes that meet my own thematic needs. Take the political police files held at the Arquivo Público do Estado de São Paulo (APESP), where I worked for hundreds of hours from March till May, and which I drew on for this earlier post about torture and São Paulo’s armed Left. After finishing at APESP, I created an RTF titled “0 APESP Index.” The index features a couple dozen topics grouped under five major headings: Police/Military, Armed Groups, Anti-Torture/Human Rights Groups/Campaigns, Links to Other Countries, and Torture Topics. Within each of these categories, I added as many subheadings as necessary — phrases like “Resisting Torture” and “Testimonies” in the case of the “Torture Topics” grouping, for instance. I then went through the full list of RTF files that I created at APESP one-by-one, right clicking, copying each of their item links, and pasting these links into the “0 APESP Index” RTF in whatever slots seemed right (see below). Helpfully, even if I move the linked RTFs around, or modify their content or titles, the links will still work.

The APESP index, at right. To the left, the organizational system used by São Paulo’s political police.

(Because I take reasonably thorough notes while in the archive, holding a future index of just this sort in mind, the whole indexing process is quite a bit less arduous than it might sound. In this instance, it took about an hour and a half to catalogue the 150-or-so PDF files that I’d created at APESP. To my mind, it’s a worthwhile investment given the organizational and analytic power it unlocks. To be fair, though, this sort of stuff is fun to me to an extent that sometimes even I find disturbing.)
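For the structurally minded, the index described above boils down to a two-level outline in which each slot holds links to notes, and the same note can sit in more than one slot. Here is a minimal Python sketch of that shape. The note titles, the "Hypothetical subheading" label, and the link strings are placeholders invented for illustration; in practice the links are the item links copied out of DevonThink.

```python
from collections import defaultdict


def build_index(entries):
    """Group (heading, subheading, title, link) tuples into a nested index."""
    index = defaultdict(lambda: defaultdict(list))
    for heading, subheading, title, link in entries:
        index[heading][subheading].append((title, link))
    return index


def render_index(index):
    """Render the index as an indented plain-text outline."""
    lines = []
    for heading, subheadings in index.items():
        lines.append(heading)
        for subheading, notes in subheadings.items():
            lines.append(f"  {subheading}")
            for title, link in sorted(notes):
                lines.append(f"    {title} -> {link}")
    return "\n".join(lines)


entries = [
    # The same note can be filed under more than one subheading.
    ("Torture Topics", "Testimonies", "1971 Hypothetical note A", "link-to-note-A"),
    ("Torture Topics", "Resisting Torture", "1971 Hypothetical note A", "link-to-note-A"),
    ("Armed Groups", "Hypothetical subheading", "1970 Hypothetical note B", "link-to-note-B"),
]
index = build_index(entries)
print(render_index(index))
```

Since the index stores links rather than copies, the original police filing system stays untouched while the thematic view sits on top of it.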

Archive-specific indexes aren’t the only sort I use DevonThink to build. The second kind are the thematic indexes which fill the Notebook portion of my database. Here, I keep running compilations of links to documents that I come across related to specific organizations, individuals, places, or themes. For instance, the armed Peronist group Montoneros is of particular interest. When I come across a document that pertains to this group, I copy-and-paste its item link into the “Montoneros” RTF in my Notebook (see below). It is my hope that, as I move into the writing stage, these indexes will serve as proto-outlines and also help me with the macro organization of the dissertation and subsidiary articles.

A thematic index, for Argentina’s Montoneros.

This description of my process hasn’t touched on many of the features that set DevonThink apart, so allow me to mention them briefly. With DevonThink, you can:

  • Import photos and merge them instantly into multi-page PDFs, which can then be OCR-converted
  • Take notes on documents and PDFs
  • “Replicate” files so that identical copies sit in various places at once, yet an edit to any is an edit to all
  • Sync to your phone or tablet using DevonThink ToGo (a product with which I’m less satisfied than I am with DevonThink Pro Office)
  • Import directly from Evernote (which has far better web-clipping capabilities than DevonThink ToGo)
  • Develop customized workflows using Automator
  • Create “smart groups” based on tags, keywords, or full text
  • Enjoy powerful search functionality including concordance

This last feature alone is, to me, worth DevonThink’s purchase price. While no OCR is perfectly searchable, on net it works pretty well, especially when supplemented by the keyword-driven notes I take in the linked RTFs. The result is that when I have only the inkling of a document in mind, I can almost always find it quickly. Full-database searches, moreover, at times yield parallels and connections that I wouldn’t have anticipated. I’d never create a thematic index without doing one first.

In closing, I want to stress that the process I’ve described here is not something I could have created whole-cloth at the outset — even having consulted the numerous academy-specific posts I found online. (Though this one in particular, by historian Rachel Leow, did serve as an extremely helpful jumping-off point.) Rather it’s a method that could only have grown, trial-and-error style, out of my intensive use of the program during a sustained period of primary research, and I’m sure it will continue to change as my work advances. If you end up going the DevonThink route, I’m sure your system will look different than mine; indeed, that’s the idea!

I hope these words and screenshots prove useful to someone. If you’re that person, or if you have any questions or have found anything here to be unclear, please do let me know!

Digital Resources for Study of the Argentine Left

One of the greatest aids to my research in Argentina has been the remarkable set of digital repositories devoted to the history of the Argentine Left. Allow me to share three particularly exciting sites, in the hope that they may be of use to other researchers:

Ruinas Digitales: A project of a group of political science students from the Universidad de Buenos Aires, Ruinas Digitales is the online home of all things left-Peronist, plus a bunch of other stuff, too. In addition to collections of Mundo Peronista and Evita Montonera, you’ll find manuals and communiques from the last dictatorship, human rights reports from the late 1970s, and copies of the 1968-73 magazine, Antropología del Tercer Mundo, among many other finds. Any student of 20th-century Argentina will want to check it out.

El Topo Blindado: An online archive devoted to the armed Left, this site goes beyond the well-known Montoneros and Ejército Revolucionario del Pueblo to cover lesser-known groups including the Ejército Guerrillero del Pueblo and the Guerrilla del Ejército Libertador. Particularly interesting are the collections of documents from right-wing insurgent groups (it’s hard not to appreciate the directness of the Liga por los Derechos del Hombre No Judío) and from organizations working in exile. With dozens of groups represented, it’s a spectacular source for political history across the spectrum.

Fundación Pluma: More specialized than the previous two sites, Fundación Pluma collects and diffuses the documentary history of the Trotskyist groups centered on the figure of Nahuel Moreno. What Pluma may lose in breadth it more than makes up for in depth, with nearly 10,000 documents from the mid-20th century through the present. (A quick glance at the 2000+ subjects covered gives a sense of the collection’s remarkable extent.) Viewing and downloading documents requires registration, but this is free and easy to do.

If you’ve come across other digital repositories that have proven useful, drop a comment and let us know!

ABBYY FineScanner for Free!

A brief follow-up to last week’s post on my love affair with ABBYY FineScanner: I’ve just received 10 promo codes good for unlimited access to ABBYY FineScanner Pro, the app on which I’ve come to rely for document capture and recognition. Five are for iOS and five for Android. They’re each worth about $60 and must be redeemed in the next 28 days. I’d hate for them to go to waste, so if you’d like one, leave a comment below or on Facebook or drop a note via this contact form. Please include your email address and specify iOS or Android. I’ll pass along a code to the first five users of each platform who get in touch.

ABBYY FineScanner for Archival Research

Since starting grad school, I’ve tried out, and cast aside, quite a few tools for reproducing and organizing archival documents. Digital cameras, portable scanners, FileMaker, Evernote, RefWorks: these are just some of the detritus lining the long and winding research road I’ve traveled these past four (!!) years. Now, at the halfway point of my time in Brazil, I can finally say that I have found my logistical footing. Three pieces of software have emerged as the pillars of my archival process: ABBYY FineScanner for document capture, DevonThink Pro Office for organization and note-taking, and Zotero for secondary bibliography.

While Zotero is both widely used and straightforward, the other two programs may be less familiar. With the Northern-summer research season fast upon us, I’d like to share my experience with these pieces of software, in the hopes of saving researchers at earlier stages of their projects from the hassles of constant platform-shifting that have plagued mine. In this post, I’ll talk a bit about document capture; in a later installment, I’ll describe my approach to organization and note-taking.

For my first three years of grad school, archival document capture meant taking digital photos of individual pages. I’d also photograph the boxes and folders that contained these documents, in the order that I reviewed them, all the while taking notes in an archive-specific Word document. The result would be two interlinked narratives of my research, one textual and the other photographic, which I could then draw upon to assemble the individual photos into whole-document pdfs.

This approach carried a single distinct advantage: it enabled me to copy large quantities of documents quickly. But the downsides were massive. For one, no matter how hard I tried, I’d consistently wind up with about one unreadably blurry picture in 50. Even in the best photo-quality case, the process of turning images of individual pages into pdfs was painstaking and extraordinarily time-consuming. Indeed, I still have a large backlog of archival photos awaiting such processing, months or even years after they were taken. Finally, the resulting pdfs were quite large, even when I reduced the component photos — and sending them through a time-consuming OCR converter made them bigger still. Even as I’d find ways to tweak this process, my fundamental dissatisfaction remained.

All the while, I knew from historian friends that there was another way: I could turn my phone into a handheld scanner using one of the many apps on the market. Even with this knowledge, however, for years I found reasons to resist making the switch. I valued the flexibility and speed of digital photos, I thought. Wouldn’t scanning on the spot slow me down? Plus, I’d invested a bunch of money in a nice digital camera; was I really going to abandon it in favor of my phone’s lower-quality one?

In January, though, my first smartphone went kaput. When its replacement impressed me with its higher photo quality, I realized that it was finally time to give scanning apps a chance. I dug up this PC Mag breakdown of the major options, and ABBYY FineScanner immediately caught my eye. The full-featured version is pricey (on the order of $20 for a year or $60 for a lifetime of OCR-equipped scanning), but I’d used ABBYY products extensively in Columbia’s Digital Humanities Center and had always been satisfied. So I decided to take the plunge.

By the end of my first day using the app in the archive, I knew that my research life would never be the same. There could be no doubt, of course, that FineScanner makes for a much slower capture process than simple photo-taking. The basic trick is this: you take a photo of a page, and no matter the angle of the photo, the app will identify and crop the document into a flat, undistorted image. This is what an original photo looks like at the cropping stage:

While this “autocrop” functionality works pretty well, I still need to double-check every page, and in the end I have to manually crop quite a lot of them (depending on the document, the proportion ranges from 10 to 100% of the pages I scan). Then, in order to turn the photos into searchable pdfs, I upload them to the ABBYY server — something doable either between document scans at the archive or once I get home, depending on the size of the documents and the pace of the day.

These slight hassles, though, are vastly outweighed by the utility of the final product. FineScanner is able to recognize nearly all of the documents I send it, meaning that the individual photos I take come back to me as completely searchable, centered and undistorted pdfs, which I can then upload directly to the cloud or send via email or text. Here’s the page from above, post-processing:

The app can recognize nearly 200 languages, and while I can’t vouch for most of them, my experiences with English, Portuguese, Spanish, and French have been excellent. And miraculously, the recognized pdfs that emerge are tiny — generally about 10% of the size of the smallest unrecognized pdfs I used to make. (Fifty-page pdfs, for instance, usually weigh in around 2.5 MB.)

The advantage of the OCR phone-scanning approach is clearest in comparison. Whereas before, I would end a day at the archive with several hundred individual photographs and the dreadful knowledge that hours of processing awaited at some sure-to-be-later time, now my days end with tiny, fully-searchable pdfs ready to be organized and consulted on demand.

If you’ve made it to the end of this post, there’s a good chance that you belong to the tiny minority of people whose lives can be changed by a well-crafted OCR-optimized portable-scanning smartphone app. And if indeed you do, I’d recommend giving ABBYY FineScanner a spin.

Fulbright-Hays Season

Instead of writing about history, today I’d like to share a few thoughts about those terrible, wonderful things that enable the writing of history: fellowships and grants. Or one of them, at least: the Fulbright-Hays Doctoral Dissertation Research Abroad program, for which applications are due March 14 (or earlier on some campuses — do check!). The DDRA, which funds six to twelve months of area-studies research abroad, is one of the stranger grants out there. It relies on year-to-year approval from Congress, wreaking havoc on the standard application timeline (and making it particularly vulnerable to Trump and the Congressional GOP). The process is demanding and byzantine, with a seemingly interminable list of requirements and an online interface that is cumbersome at best. But at the same time, the evaluation procedure is unusually transparent, and it works differently from many other grants, in ways that may prove particularly advantageous to some. So in this spirit of transparency, I’d like to share a few thoughts derived from my own DDRA experience, in the hope that they can be of use to other applicants.

The Fulbright-Hays DDRA, I should say off the bat, is the grant that is currently supporting me here in Brazil (where I arrived two weeks ago), and which will be taking me to Argentina in August. This fortunate turn of events very nearly wasn’t to be, however; I almost didn’t apply at all. Indeed, last year’s early May deadline found me at the lowest point of my doctoral experience. The infinite time-sump of grant season had, to that point, yielded only stress and disappointment. The DDRA seemed an even less likely prospect than the various funders who had already rejected my proposals. The application guidelines, after all, state that “awards are not made to applicants planning to conduct research on topics that are determined to be politically sensitive… by the U.S. Embassy or Fulbright Commission in the host country.” A history of systematic state torture in which neither the U.S. nor the host countries end up looking very good struck me as the essence of political sensitivity, a suspicion that only seemed to be confirmed through communication with the program’s director. With life-eating Orals just weeks away, late April seemed a particularly improvident time to be chasing waterfalls.

Fortunately, though, my advisors urged me to apply anyway. Armed with advice from past DDRA fellows Rachel Grace Newman and Jennifer Adair, I decided to take the plunge — and wow am I glad that I did. Not only was my project not disqualified, but in the end it benefitted tremendously from the DDRA’s uncommon evaluation system.

How so, you ask? Most other funders make awards by committee; in other words, a group of scholars meets to decide collectively who among the finalists will receive a grant. But the Fulbright-Hays works differently. According to the DDRA system, independent evaluators award a total of up to 100 points across ten predetermined categories, six of which concern the project proposal itself, and four, the qualifications of the applicant. Projects that utilize one of 78 “priority languages” (Portuguese is one) are given three bonus points; those in certain preferred academic fields (history, unsurprisingly, not among them) get two more. The scores are summed, and awards are made.
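To make the arithmetic concrete, here is a minimal Python sketch of how a single evaluator’s total would add up under the system as I’ve described it. The per-category points below are hypothetical, since the maximum for each category isn’t listed here, and the sketch says nothing about how scores from different evaluators are combined. The function name `evaluator_total` is likewise just an illustrative label.

```python
def evaluator_total(category_points, priority_language=False, preferred_field=False):
    """Sum one evaluator's ten category scores and add any bonus points."""
    if len(category_points) != 10:
        raise ValueError("expected scores for exactly ten categories")
    base = sum(category_points)
    if base > 100:
        raise ValueError("category scores cannot total more than 100")
    bonus = 0
    if priority_language:
        bonus += 3  # project uses a priority language
    if preferred_field:
        bonus += 2  # project falls in a preferred academic field
    return base + bonus


# Hypothetical scores: 9 points in each of the ten categories.
hypothetical_points = [9] * 10
total = evaluator_total(hypothetical_points, priority_language=True, preferred_field=False)
print(total)  # 93
```

In this made-up case, a history project in Portuguese picks up the three language points but not the two field points, for 93 out of a possible 105.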

Why might this system benefit some projects more than others? Allow me to extrapolate from my own experience. Last year I noticed a pattern in the reviewer feedback I received from unsuccessful grant applications: it tended to be quite polarized. Generally, one reviewer in particular would express serious reservations about my project (perhaps a grant-world variation of the dreaded Reviewer 2?). This was true for my DDRA evaluations as well; one reader awarded me all possible points, while another gave me a score that I suspect was at the far lower end of those earned by grant recipients. I am only speculating here, but it’s my hunch that committees are less likely to award grants to proposals with fervent detractors, even if they also have strong support. I don’t think it’s a stretch to imagine, then, that the DDRA’s distinctive evaluation structure worked to my advantage, and could similarly favor others pursuing research on divisive topics. If you’re with me in this boat, then, I’d super-duper encourage you to apply.

The DDRA’s unusual point system also carries another advantage: it allows you to structure your application explicitly around the rubric that evaluators will use. (This document, called the “Technical Review Form,” is available to applicants through the G5 portal; I’ve embedded last year’s at the bottom of this post.) It’s always a good idea to tailor grant applications to funders’ interests, of course; but having such a clear readers’ view enables an unusually snug fit. At past fellows’ urging, I built my narrative statement in direct response to the enumerated criteria, going in sequential order to ensure that reviewers would never have to search for the information relevant to each point. This may seem obvious, but it’s something I would not have done quite so explicitly had past fellows not insisted. So to all potential DDRA applicants out there, let me add my voice to the chorus: may your narrative, and the technical review form, be as one.

Applying for grants is one of the least pleasant aspects of graduate school. Indeed, for me at least, the quest for research funds has been by far the single greatest source of stress in my nascent academic career. And this is to say nothing of the socioeconomic exclusions that the system only serves to reinforce. Let’s do what we can to ameliorate this situation by speaking openly about our challenges and by sharing as broadly as possible the insights that our fortuities afford.

FY16_Technical_Review_Form

The Amazing Generosity of Academic Research in Brazil

More than two weeks have passed since I arrived in São Paulo, marking this afternoon as high time for the first of what I promised would be many updates punctuating a year and a half of primary research abroad. Fortunately for the world’s stock of optimism, today’s brief note brims with nothing but positivity for what I have experienced so far here in Brazil. (And with the infestation of fleas that greeted me at my first Sampa apartment rental still rather fresh in my mind, this is perhaps a stronger statement than might, on its face, be apparent.)

In fairness, I fell in love with this country on my first archival visit to Rio and São Paulo in the (northern) summer of 2014, so the joys of these two weeks have come as little surprise. And any country that brings together tropical fruit, strong coffee, and top-notch Levantine dips is running strong out of the gate.

See what I mean?

Nonetheless, I have been bowled over by the truly spectacular generosity of the professors and graduate students whom I have met in this decidedly more focused phase of research. In only two weeks, I have received two invitations to lunch, one to dinner, another to a senior scholar’s birthday party, an MA thesis’ worth of bibliographic references, a standing offer for a ride to the University of Campinas (a graduate-focused university an hour and a half outside the city, with which I am affiliated here), and more emails of introduction than I can count on two hands. That these many kindnesses have been extended against the backdrop of this country’s stomach-churning, and ever-deepening, slow-motion coup has only strengthened my sense that I am, at present, exactly where I ought to be.

More soon on the research itself. For now, though, just a small expression of gratitude for the bounteous magnanimity that makes this work not simply possible, but a pleasure.