Lesson 1: Subjects and Actions

Sentences usually communicate 2 main pieces of information: 1) who is the sentence about, and 2) what did they do? You can help readers find this information using cues in your sentence structure. For example, characters (who is the sentence about?) in your sentences are most likely to be interpreted correctly when placed in the grammatical subject. Similarly, your intended action is best placed in the sentence’s verb. You can use these structural decisions to minimize the amount of energy your readers require to understand your writing.

This lesson introduces three structural reader expectations. It presupposes that you understand the basic division of English sentences into subject, verb, and complement.

Principles

  1. Put actions in verbs
  2. Put characters in subjects
  3. Keep subjects near verbs

Principle 1: Put actions in verbs

Verbs are action words: they describe what someone or something does, like to explore, to examine, or to observe. Verbs can be turned into nouns, which changes the word from an action to a thing. For example, the verb to analyze can be changed into its noun form analysis. A noun that is formed from a verb like this is called a nominalization. Nominalizations are nouns that contain a hidden action. (Nominalizations can also be words other than nouns, but they’re usually nouns in scientific writing.)

Here are some examples of scientific verbs and their nominalizations:

Action            Nominalization
to regulate       regulation
to analyze        analysis
to occur          occurrence
to understand     understanding
to investigate    investigation
to delineate      delineation
to perform        performance

There is nothing inherently wrong with nominalizations, but many scientific writers misuse them by using abstract nouns to convey action. This creates a disconnect between structure and meaning — the intended action is no longer found in the verb. Most readers expect the main action of a clause to be found in a verb. This is because verbs inherently convey action, and nouns do not. If you fail to put your intended action in a verb, your reader must work to determine where the action is. For example:

Sentence                                 Action located in
We performed an analysis on the data.    nominalization
We analyzed the data.                    verb

What is going on in these sentences? In the first example, the verb is to perform, but the intended action is probably to analyze (hidden in the nominalization analysis). The point of this sentence probably has nothing to do with performance. But a reader of the first example has to consider this possibility (if only subconsciously), while the reader of the second clearly understands the action. This is a trivial example, but the point is more important in complex sentences (see examples below).

Scientific writing regularly disguises the main actions in nouns, costing reader energy. If you overuse nominalizations, you can improve your writing by restructuring your sentences to capture actions in verbs.

Revision Technique

Go through your manuscript and underline all nominalizations. Take a closer look at these words to see if they should be changed to verbs.

Or, it may be easier to do the opposite: Go through the manuscript and underline all the verbs. For each verb, ask yourself this question: Does this verb capture the action in the sentence?
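
For a long manuscript, a short script can do a rough first pass of the underlining. The sketch below (in Python) flags words whose endings often mark nominalizations. The suffix list, the minimum word length, and the function name flag_candidate_nominalizations are assumptions made for illustration, not a complete rule: the script will miss nominalizations like understanding and will flag ordinary nouns like station, so treat its output as candidates to inspect by eye.

```python
import re

# Assumed, non-exhaustive list of endings; a heuristic, not a grammar rule.
NOMINALIZATION_SUFFIXES = ("tion", "sion", "ment", "ance", "ence", "ysis")


def flag_candidate_nominalizations(text, min_length=7):
    """Return the words in `text` whose endings suggest a nominalization."""
    candidates = []
    for word in re.findall(r"[A-Za-z]+", text):
        lower = word.lower()
        # Also check the word without a final "s", so that the plural
        # "modifications" is tested as "modification".
        forms = (lower, lower[:-1]) if lower.endswith("s") else (lower,)
        if any(
            len(form) >= min_length and form.endswith(NOMINALIZATION_SUFFIXES)
            for form in forms
        ):
            candidates.append(word)
    return candidates


print(flag_candidate_nominalizations("We performed an analysis on the data."))
# ['analysis']

print(flag_candidate_nominalizations(
    "The ABC database has been subject to different improvements, "
    "modifications, and extensions in structure and content over the years."
))
# ['improvements', 'modifications', 'extensions']
```

The script only finds candidates. Deciding whether each one hides the intended action of its sentence is still your job.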

Nominalizations are sometimes useful; for example, when they summarize the action of the previous sentence. In such a case, a nominalization is a good way to form a backwards link to something already familiar to the reader. For example:

We analyzed the data. This analysis demonstrated the need for additional experiments.

Principle 2: Put characters in subjects

The character is the actor (the entity performing the action). Readers expect the main character in a clause to be found in the subject. Characters can be (and often are) abstract nouns, like expression level or exon usage.

Here are two examples that use subjects differently. Imagine each sentence in a paragraph discussing bacteria. In the first example, there is a disconnect between the subject and the intended main character:

The movement in the liquid medium of the bacteria was accomplished by microflagella.

In the second version, the content is the same, but the structure is changed. The main character is now found in the subject:

The bacteria move themselves in the liquid medium with microflagella.

In the first sentence, the grammatical subject is an abstract noun (movement), which really describes the action of the main character. The second example is clearer because the intended actor (what’s the sentence about?) is the same as the grammatical subject (bacteria).

The grammatical subject of the sentence should be the answer to the question: What is this sentence about? This principle goes hand-in-hand with the actions/verbs principle. I don’t think this is usually as big a problem in scientific writing, and it is usually fixed at the sentence level when you revise to put actions in verbs.

More importantly, science writing often has the problem of subject shifting: subjects change erratically throughout a paragraph. It’s fine to change the grammatical subject from one sentence to the next if you intend to change the topic. But often, writers intend to discuss a particular topic for several sentences (the topic doesn’t change), yet keep changing the grammatical subject. Writing is easier to follow when the string of subjects in a paragraph reflects the topics. Paragraph units are most effective when they either 1) discuss a single topic; or 2) discuss a series of related topics that build on one another. You can fulfill reader expectations by maintaining a logical flow of grammatical subjects in a paragraph. There are two primary ways to accomplish this:

  1. Maintain a common subject throughout a one-topic paragraph
  2. Shift the subject appropriately according to the story

In this paragraph of four clauses, the topic and the main character are primate genome sequences. In the first example, the grammatical subject of each clause matches the topic. I’ve marked the subjects in brackets.

To understand human evolution, [genomes from related primates] are necessary. For example, [several primate genomes] are needed to identify features common to primates or unique to humans. Fortunately, [such genome-wide exploration] is now a reality; in the past 5 years, [genome sequences of several nonhuman primates] have been released.

In this alternative example, the grammatical subjects shift, while the topic of the paragraph stays the same. This paragraph says the same thing as the previous one:

To understand human evolution, genomes from related primates are necessary. For example, identification of features common among primates or unique to humans will require several primate genomes. Fortunately, scientists can now do such genome-wide exploration; in the past 5 years, the community has released several nonhuman primate genome sequences.

Compare the subject strings:

  1. genomes from related primates … primate genomes … genome-wide exploration … genome sequences
  2. genomes from related primates … identification of features … scientists … the community

The first example is easier for a reader to understand because the subject (while not exactly the same words) is consistent and familiar throughout the paragraph. The second example shifts the subject twice, disconnecting it from the topic of the paragraph.

Sometimes it’s necessary to write explanatory paragraphs that build from one thing to the next. In this case, the subjects can shift as the topics shift. This is a common construction in scientific writing:

Technology often drives science. Among the most impressive recent technological advances is DNA sequencing. More efficient sequencing has reduced the cost of generating sequence data significantly. Cheaper data in turn enables more researchers to do data-intensive experiments, which results in a huge amount of data being released into the public domain. Dealing with data in such large quantity will require a new generation of scientists.

This subject string clearly is shifting, but it does so in an intended, logical flow that builds up to the final point of the paragraph. Each subject connects to the previous subject (or is the object of the previous sentence):
Technology… DNA sequencing… More efficient sequencing… Cheaper data… Huge amount of data… Dealing with data.

You can understand the gist of the paragraph just by reading the succession of subjects. The point of this example is to illustrate that you don’t need every paragraph to have exactly one topic and subject. Instead, just be aware of what your subjects are, and whether they match the structure of the idea you intend to communicate.

Revision Technique

Highlight the subject of each sentence. Does the structure of your subjects match the information you intend to convey? In other words, are the subjects of the sentences jumping from one thing to another, or do they shift only when you intend to shift the topic under discussion?

Note: One problem that frequently makes scientific writing confusing is a sentence without a character; such sentences can be caused by passive voice, which can leave a reader to guess the actor (that’s a Bad Thing). More on this in the section on passive voice.

Principle 3: Keep subjects near verbs

Recall the two primary pieces of information a reader looks for:

  1. who is the sentence about?
  2. what are they doing?

When these two pieces of information are far apart, that usually means one of them isn’t arriving until the end of the sentence. This confuses readers, because they can’t piece together the whole picture without answers to these questions. In science writing, this is often caused by long, complex subjects. I find many sentences that go on and on and finally provide the verb at the end of the sentence. When this happens, readers must re-read the sentence, now that they know the action.

For example, can you understand this sentence on the first reading?

Farmers that understand the difference between the soil requirements of plants when they are seedlings and their requirements when they are mature are in high demand.

The subject Farmers is separated from the verb phrase are in high demand by 21 words. If we reduce this distance, we get a more understandable (though still not perfect) sentence:

Farmers are in high demand if they can understand the difference between the soil requirements of plants when they are seedlings and their requirements when they are mature.

A similar problem happens with long lists. Authors provide a long list of stuff with no context, and the verb doesn’t show up until the end of the sentence:

Peanuts, shrimp, almonds, milk or anything else with lactose, and wheat or anything with gluten all represent things that people are commonly allergic to.

You have no idea what you’re reading until the end. When you find out, you must re-read the sentence to comprehend what these things have in common. To revise, just give the context before the list:

People are commonly allergic to things like peanuts, shrimp….

Now the list can be any length without reducing understandability.

Revision Technique

Identify the main subject and its verb in your sentence. If they are far apart, rephrase the sentence to bring them closer together.
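
If you want a number to go with "far apart," you can count the words between the two. The sketch below (in Python) does this for the farmers sentence from earlier in this section. It does no grammatical parsing: you supply the subject and the verb phrase yourself, and the function name words_between is an assumption made for illustration. It searches for the verb phrase from the end of the sentence, because words like "are" may also occur inside a long subject.

```python
def words_between(sentence, subject, verb_phrase):
    """Count the words separating a one-word subject from its verb phrase.

    `subject` and `verb_phrase` are chosen by the writer; nothing here
    identifies them automatically.
    """
    words = sentence.rstrip(".").split()
    verb_words = verb_phrase.split()
    subject_index = words.index(subject)
    # Scan backwards so that the last matching verb phrase is used.
    for start in range(len(words) - len(verb_words), subject_index, -1):
        if words[start:start + len(verb_words)] == verb_words:
            return start - subject_index - 1
    raise ValueError("verb phrase not found after the subject")


original = (
    "Farmers that understand the difference between the soil requirements "
    "of plants when they are seedlings and their requirements when they "
    "are mature are in high demand."
)
revised = (
    "Farmers are in high demand if they can understand the difference "
    "between the soil requirements of plants when they are seedlings and "
    "their requirements when they are mature."
)

print(words_between(original, "Farmers", "are in high demand"))  # 21
print(words_between(revised, "Farmers", "are in high demand"))   # 0
```

There is no threshold that separates acceptable from unacceptable distances. The count is only a prompt to look again at sentences where the verb arrives late.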

Examples

Example 1

The ABC database has been subject to different improvements, modifications, and extensions in structure and content over the years.

This sentence relies on nominalizations to convey action. The awkward verb of the sentence (“has been subject to”) is basically meaningless; the authors likely intended to convey action in the words improvement, modification, and extension. But these are all nominalizations. By converting these into verbs, we get a much clearer sentence, and eliminate “has been subject to”:

The ABC database has been improved, modified, and extended in both structure and content over the years.

To clarify even further: doesn’t improved imply modified? Possibly it even implies extended. To strip it to exactly what you mean, what about this?

The curators have improved the structure and content of the ABC database.

Example 2

Mapping of open chromatin regions, post-translational histone modifications and DNA methylation across a whole genome is now feasible, and new non-coding RNAs can be sensitively identified via RNA sequencing.

This sentence presents a list before providing a context for it. This is apparent in the distance before we get to the verb “is feasible.” Another problem is that the main action of this sentence is contained in the nominalization “mapping.” Here’s one possible revision:

It is now feasible to map open chromatin regions, post-translational histone modifications and DNA methylation across a whole genome, and to sensitively identify new non-coding RNAs via RNA sequencing.

This revision is much easier to understand, though I would consider dividing this sentence into two because it seems to be trying to explain two unrelated things.

Example 3

Significant positive correlations were evident between the substitution rate and a nucleosome score from resting human T-cells.

This sentence relies on a nominalization (correlation) to convey action. I don’t think the intended action is “were evident,” which is the verb. A possible revision:

In resting human T-cells, the substitution rate correlated with a nucleosome score.

Example 4

The possibility that some termini have a base composition different from that of DNA simply because they are the nearest neighbors of termini specifically recognized by the enzymes can be checked by comparing the experimental results with those expected from the nearest neighbor data.

This sentence suffers from an extreme case of subject-verb separation. Really, this is indicative of a subject that is far too complex, but we can solve the problem by trying to bring subject and verb closer together. The main (simple) subject of the sentence is possibility, and the main verb that conveys the action of this subject is can be checked. Just highlighting these shows how far apart they are:

The [possibility] that some termini have a base composition different from that of DNA simply because they are the nearest neighbors of termini specifically recognized by the enzymes [can be checked] by comparing the experimental results with those expected from the nearest neighbor data.

Rephrasing to bring possibility and check nearer:

If we compare the experimental results with those expected from the nearest neighbor data, we can check the possibility that some termini have a base composition different from that of DNA simply because they are the nearest neighbors of termini specifically recognized by the enzymes.

And with a few more nuanced changes, we get a more readable explanation:

If we compare our expectations with experimental results, we can identify any termini that differ in base composition simply because they are the nearest neighbors of those specifically recognized by the enzymes.

Worksheet

lesson1.pdf
