mmtbx.validation package

The validation module combines all aspects of model validation, both with respect to geometry and against experimental data. Many of these were adapted from the MolProbity web server, and continue to be used for that purpose. They are also available in the Phenix GUI as a standalone program and as an accessory to phenix.refine. However, the individual analyses may also be run separately and used to guide decisions during model building and refinement.

Base classes

With few exceptions, all analyses in the validation framework inherit from a common set of base classes (or use them internally). This provides a unified API for accessing similar information.

class mmtbx.validation.atom(**kwds)

Bases: atom_base, entity

Base class for validation results for a single atom. This is distinct from the atom_info class, which is used to track individual atoms within a multi-atom validation result.

altloc
atom_selection
b_iso
chain_id
element
icode
is_single_residue_object()
model_id
name
occupancy
outlier
resname
resseq
score
segid
symop
xyz
class mmtbx.validation.atom_base(**kwds)

Bases: slots_getstate_setstate

Container for metadata for a single atom, in the context of validation results involving multiple atoms. Intended to be used as-is inside various atoms classes.

atom_group_id_str()
id_str(ignore_altloc=False, ignore_segid=False)
property resid
residue_group_id_str()
class mmtbx.validation.atom_info(**kwds)

Bases: atom_base

Container for metadata for a single atom, in the context of validation results involving multiple atoms. Intended to be used as-is inside various atoms classes.

altloc
b_iso
chain_id
element
icode
model_id
name
occupancy
resname
resseq
segid
symop
xyz
class mmtbx.validation.atoms(**kwds)

Bases: entity

Base class for validation results involving a specific set of atoms, such as covalent geometry restraints, clashes, etc.

atom_selection
atoms_info
get_altloc()
is_in_chain(chain_id)
is_single_residue_object()
merge_two_dicts(x, y)

Given two dictionaries, merge them into a new dict as a shallow copy, for json output.
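The behavior described for merge_two_dicts can be sketched in plain Python as follows. This is an illustration of a shallow-copy merge, not the mmtbx source; the sample keys and the rule that the second dictionary wins on conflicting keys are assumptions of the sketch.

```python
def merge_two_dicts(x, y):
    """Return a new dict holding the entries of x and y (y wins on conflicts)."""
    z = x.copy()   # shallow copy: nested values are still shared with x
    z.update(y)    # entries from y overwrite same-named entries from x
    return z


# Hypothetical fragments of a per-residue JSON record.
residue_part = {"chain_id": "A", "resseq": "42"}
score_part = {"score": 0.12, "outlier": True}
merged = merge_two_dicts(residue_part, score_part)
```

Neither input is modified, which is what makes this form convenient when the same residue metadata is reused across several JSON records.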

n_atoms()
nest_dict(level_list, upper_dict)
outlier
score
sites_cart()
xyz
class mmtbx.validation.dummy_validation

Bases: object

Placeholder for cases where values may be undefined because of molecule type (e.g. all-RNA structures) but we want to substitute None automatically.

class mmtbx.validation.entity(**kwds)

Bases: slots_getstate_setstate

Base class for all validation results. This includes a boolean outlier flag, the information used to zoom in the Phenix GUI (optional, but strongly recommended), and some kind of numerical score (also optional, but strongly recommended - although some analyses may require multiple distinct scores).

as_JSON()

Returns an (empty) JSON object representing a single validation object. Should be overridden by each validation script to actually output the data.

as_kinemage()

Returns a kinemage string for displaying an outlier.

as_list()

Optional; returns old format used by some tools in mmtbx.validation.

as_selection_string()

Returns PDB atom selection string for the atom(s) involved.

as_string(prefix='')
as_table_row_molprobity()

Returns a list of formatted table cells for display by MolProbity.

as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
format_old()
format_score(replace_none_with='None')
static header()

Format for header in result listings.

id_str(ignore_altloc=None)

Returns a formatted (probably fixed-width) string describing the molecular entity being validated, independent of the analysis type.

is_outlier()
is_single_residue_object()
molprobity_table_labels = []
outlier
score
score_format = '%s'
xyz
zoom_info()

Returns data needed to zoom/recenter the graphics programs from the Phenix GUI.

mmtbx.validation.get_atoms_info(pdb_atoms, iselection, use_segids_in_place_of_chainids=False)
class mmtbx.validation.residue(**kwds)

Bases: entity

Base class for validation information about a single residue, which, depending on context, could mean any one of the residue_group, atom_group, or residue objects from the PDB hierarchy.

altloc
assert_all_attributes_defined()
atom_group_id_str()
atom_selection
atom_selection_string()
chain_id
icode
id_str(ignore_altloc=False)

Returns a formatted (probably fixed-width) string describing the molecular entity being validated, independent of the analysis type.

is_same_residue(other, ignore_altloc=False)
is_same_residue_group(other)
is_single_residue_object()
nest_dict(level_list, upper_dict)
occupancy
outlier
property resid
residue_group_id_str()
residue_id(ignore_altloc=False)
resname
resseq
resseq_as_int()
score
segid
set_coordinates_from_hierarchy(pdb_hierarchy, atom_selection_cache=None)
simple_id()
xyz
class mmtbx.validation.rna_geometry

Bases: validation

n_outliers
n_outliers_by_model
n_total
n_total_by_model
results
show(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='  ', verbose=True)
class mmtbx.validation.test_utils

Bases: object

count_dict_values(count_key, c=0)

For counting hierarchical values when testing hierarchical JSON output.

class mmtbx.validation.validation

Bases: slots_getstate_setstate

Container for a set of results from a single analysis (rotamers, clashes, etc.). This is responsible for the console display of these results and associated statistics. Individual modules will subclass this and override the unimplemented methods.

as_coot_data()

Return results in a format suitable for unpickling in Coot.

as_gui_table_data(outliers_only=True, include_zoom=False)

Format results for display in the Phenix GUI.

as_kinemage()
coot_todo()
find_atom_group(other=None, atom_group_id_str=None)

Attempt to locate a result corresponding to a given atom_group object.

find_residue(other=None, residue_id_str=None)
get_outliers_count_and_fraction()
get_outliers_fraction_for_model(model_id)
get_outliers_goal()
get_result_class()
gui_formats = []
gui_list_headers = []
iter_results(outliers_only=True)
merge_dict(a, b, path=None)

Recursive function for merging two dicts; merges b into a. Mainly used to build hierarchical JSON outputs.
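A recursive merge of this kind can be sketched as below. This is a generic illustration rather than the mmtbx implementation: the conflict rule (the value from b replaces a non-dict value in a) and the sample model/chain/residue keys are assumptions.

```python
def merge_dict(a, b, path=None):
    """Recursively merge dict b into dict a (in place) and return a."""
    if path is None:
        path = []  # keys descended so far; kept to mirror the documented signature
    for key, value in b.items():
        if key in a and isinstance(a[key], dict) and isinstance(value, dict):
            # Both sides are dicts: descend one level instead of overwriting.
            merge_dict(a[key], value, path + [str(key)])
        else:
            # Assumption of this sketch: on any other conflict, b's value wins.
            a[key] = value
    return a


# Hypothetical hierarchical output keyed by model, chain and residue number.
tree = {"1": {"A": {"42": {"rota": "OUTLIER"}}}}
extra = {"1": {"A": {"42": {"rama": "Favored"}, "43": {"rama": "Allowed"}}}}
merge_dict(tree, extra)
```

After the call, residue 42 carries both the rotamer and the Ramachandran entry, and residue 43 has been added alongside it.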

n_outliers
n_outliers_by_model
n_total
n_total_by_model
outlier_selection()

Return a flex.size_t object containing the i_seqs of atoms flagged as outliers (either individually or as part of an atom_group). This needs to be implemented in the underlying classes unless they include a pre-built _outlier_i_seqs attribute.

output_header = None
property percent_outliers
program_description = None
results
save_table_data(file_name=None)

Save all results as a comma-separated text file.

show(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='  ', outliers_only=True, verbose=True)
show_old_output(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, verbose=False)

For backwards compatibility with output formats of older utilities (phenix.ramalyze et al.).

show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
wx_column_widths = []
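To make the relationship between a result object and its container concrete, here is a minimal stand-in written in plain Python. The class names Result and Validation are hypothetical and do not import or reproduce mmtbx code; only the attribute and method names (results, n_total, n_outliers, iter_results, percent_outliers, is_outlier) are borrowed from the listing above.

```python
class Result:
    """Stand-in for an entity-like result: a label, a score and an outlier flag."""

    def __init__(self, label, score, outlier):
        self.label = label
        self.score = score
        self.outlier = outlier

    def is_outlier(self):
        return self.outlier


class Validation:
    """Stand-in for a validation-like container holding one analysis' results."""

    def __init__(self, results):
        self.results = list(results)
        self.n_total = len(self.results)
        self.n_outliers = sum(1 for r in self.results if r.is_outlier())

    def iter_results(self, outliers_only=True):
        # Yield every result, or only the flagged ones when outliers_only is set.
        for r in self.results:
            if r.is_outlier() or not outliers_only:
                yield r

    @property
    def percent_outliers(self):
        # Guard against an empty result set.
        if self.n_total == 0:
            return 0.0
        return 100.0 * self.n_outliers / self.n_total


v = Validation([
    Result("A  10 LEU", 12.5, False),
    Result("A  11 ARG", 0.1, True),
    Result("A  12 SER", 45.0, False),
    Result("A  13 LYS", 0.0, True),
])
flagged = [r.label for r in v.iter_results(outliers_only=True)]
```

A real subclass would additionally supply the display hooks (show_summary, as_gui_table_data and so on) that the base class leaves unimplemented.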

Subpackages

Submodules

Nearly a dozen different analyses may be performed. Note that many of these require additional programs and/or data not distributed with CCTBX except in the context of Phenix:

  • Reduce and Probe: standalone C++ programs for adding hydrogens and analyzing atomic contacts, respectively. These are available from the Richardson lab (http://kinemage.biochem.duke.edu).

  • suitename: standalone C program for analyzing RNA geometry, also from the Richardson lab.

  • Geometry restraints: this can be substituted by the standard CCP4 monomer library, but Phenix includes its own set of restraints (partially derived from CCP4’s).

  • Ramachandran and rotamer distributions: these contain frequencies for each conformation based on a library of high-quality X-ray structures.

These are all essentially freely available; contact the developers if you require specific files. If you are using a Phenix installation to perform CCTBX development, you will already have access to everything necessary.

Rotalyze - protein sidechain rotamer analysis

Also available as a standalone program, phenix.rotalyze.

mmtbx.validation.rotalyze.construct_complete_sidechain(residue_group)
mmtbx.validation.rotalyze.draw_rotamer_plot(rotalyze_data, rotarama_data, residue_name, file_name, show_labels=True)
mmtbx.validation.rotalyze.evaluate_residue(residue_group, sa, r, all_dict, sites_cart=None)
mmtbx.validation.rotalyze.evaluate_rotamer(atom_group, sidechain_angles, rotamer_evaluator, rotamer_id, all_dict, outlier_threshold=0.003, sites_cart=None)
mmtbx.validation.rotalyze.get_center(residue)
mmtbx.validation.rotalyze.get_occupancy(atom_group)
mmtbx.validation.rotalyze.get_residue_key(atom_group)
mmtbx.validation.rotalyze.has_heavy_atoms(atoms)
class mmtbx.validation.rotalyze.residue_evaluator

Bases: object

evaluate_residue(residue_group)
class mmtbx.validation.rotalyze.rotalyze(pdb_hierarchy, data_version='8000', outliers_only=False, show_errors=False, use_parent=False, out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, quiet=False)

Bases: validation

as_JSON(addon_json={})
as_coot_data()

Return results in a format suitable for unpickling in Coot.

coot_todo()
data_version
display_wx_plots(parent=None, title='MolProbity - Sidechain Chi1/Chi2 plots')
evaluateScore(value, model_id='')
get_favored_goal()
get_outliers_goal()
get_plot_data(residue_name, point_type)
get_result_class()
gui_formats = ['%s', '%s', '%.2f', '%.1f', '%.1f', '%.1f', '%.1f']
gui_list_headers = ['Chain', 'Residue', 'Score', 'Chi1', 'Chi2', 'Chi3', 'Chi4']
n_allowed
n_allowed_by_model
n_favored
n_favored_by_model
n_outliers
n_outliers_by_model
n_total
n_total_by_model
out_percent
outlier_threshold
output_header = 'residue:occupancy:score%:chi1:chi2:chi3:chi4:evaluation:rotamer'
percent_allowed
percent_favored
program_description = 'Analyze protein sidechain rotamers'
results
show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
wx_column_widths = [75, 120, 100, 100, 100, 100, 100]
class mmtbx.validation.rotalyze.rotamer(**kwds)

Bases: residue

Result class for protein sidechain rotamer analysis (molprobity.rotalyze).

altloc
as_JSON()

Returns an (empty) JSON object representing a single validation object. Should be overridden by each validation script to actually output the data.

as_hierarchical_JSON()
as_string()
as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
chain_id
chi_angles
evaluation
format_chi_angles(pad=False, sep=',')
format_old()
get_chi1_chi2()
static header()

Format for header in result listings.

icode
id_str_old()
incomplete
model_id
occupancy
outlier
resname
resseq
rotamer_name
score
segid
xyz
class mmtbx.validation.rotalyze.rotamer_ensemble(all_results)

Bases: residue

Container for validation results for an ensemble of residues.

altloc
as_string()
atom_selection
chain_id
chi_angles
evaluation
icode
incomplete
model_id
occupancy
outlier
resname
resseq
rotamer_frequencies()
rotamer_name
score
segid
xyz
class mmtbx.validation.rotalyze.rotamer_plot(*args, **kwds)

Bases: simple_matplotlib_plot, rotamer_plot_mixin

class mmtbx.validation.rotalyze.rotamer_plot_mixin

Bases: rotarama_plot_mixin

set_labels(y_marks=(60, 180, 300))
mmtbx.validation.rotalyze.split_rotamer_names(rotamer)
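The outlier_threshold=0.003 default in evaluate_rotamer(), together with the n_favored, n_allowed and n_outliers counters, suggests a three-way classification of the rotamer score. The sketch below illustrates that idea in plain Python. The 0.02 favored cutoff, the category labels and the choice of inclusive boundaries are assumptions, not values taken from this page.

```python
OUTLIER_THRESHOLD = 0.003          # default shown in the evaluate_rotamer() signature
ASSUMED_FAVORED_THRESHOLD = 0.02   # assumption of this sketch


def classify_rotamer(score, outlier_threshold=OUTLIER_THRESHOLD,
                     favored_threshold=ASSUMED_FAVORED_THRESHOLD):
    """Classify a rotamer score given as a fraction between 0 and 1."""
    if score >= favored_threshold:
        return "Favored"
    if score >= outlier_threshold:
        return "Allowed"
    return "OUTLIER"


# Hypothetical scores for three side chains.
labels = [classify_rotamer(s) for s in (0.35, 0.01, 0.001)]
```

Note that the output header above reports the score as a percentage (score%), so a reported value would need to be divided by 100 before being compared with these fractional thresholds.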

Ramalyze - Ramachandran plot analysis

Also available as a standalone program, phenix.ramalyze.

class mmtbx.validation.ramalyze.c_alpha(id_str, xyz)

Bases: slots_getstate_setstate

Container class used in the generation of kinemages.

id_str
xyz
mmtbx.validation.ramalyze.construct_complete_residues(res_group)
mmtbx.validation.ramalyze.draw_ramachandran_plot(points, rotarama_data, position_type, title, show_labels=True, markerfacecolor='white', markeredgecolor='black', show_filling=True, show_contours=True, markersize=10, point_style='bo')
mmtbx.validation.ramalyze.find_region_max_value(rama_key, phi, psi, allow_outside=False)
mmtbx.validation.ramalyze.format_ramachandran_plot_title(position_type, residue_type)
mmtbx.validation.ramalyze.get_altloc_from_id_str(id_str)
mmtbx.validation.ramalyze.get_altloc_from_three(three)
mmtbx.validation.ramalyze.get_cas_from_three(three)
mmtbx.validation.ramalyze.get_contours(position_type)

Function for determining the contours in a Ramachandran plot

Parameters:
  • position_type (int) – one of the constants defined at the beginning of the file (e.g. RAMA_GENERAL)

Returns:

list containing contours (2 numbers)

Note: the data for plotting is “scaled” in mmtbx/validation/utils.py, in export_ramachandran_distribution(): return npz ** scale_factor, with scale_factor = 0.25. Therefore, to calculate contours, look at evalScore() in mmtbx/validation/ramalyze.py for the logic and raise the cutoff numbers to the power of 0.25.
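The scaling described for get_contours can be worked through in a few lines. The sketch below raises a pair of probability cutoffs to the power 0.25 so that they line up with the scaled plotting data. The cutoff values used here (0.0005 and 0.02, commonly quoted for the general Ramachandran category) are assumptions; the authoritative numbers are whatever evalScore() uses for each position type.

```python
SCALE_FACTOR = 0.25  # exponent named in the get_contours description

# Assumed cutoffs for illustration only; check evalScore() for the real ones.
ASSUMED_CUTOFFS = {"allowed": 0.0005, "favored": 0.02}


def contour_levels(cutoffs, scale_factor=SCALE_FACTOR):
    """Map raw probability cutoffs onto the scale of the plotted distribution."""
    return sorted(value ** scale_factor for value in cutoffs.values())


levels = contour_levels(ASSUMED_CUTOFFS)
```

Because the power function is monotonic, the ordering of the cutoffs is preserved, so the first level returned corresponds to the allowed boundary and the second to the favored boundary.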

mmtbx.validation.ramalyze.get_dihedral(four_atom_list)
mmtbx.validation.ramalyze.get_favored_peaks(rama_key)

Returns the exact favored peaks with their score values.

mmtbx.validation.ramalyze.get_favored_regions(rama_key)

Returns a list of (phi, psi) tuples inside the separate favorable regions of a particular Ramachandran plot. It is not the best idea to use strings, but it is not clear how to conveniently use the constants defined at the beginning of the file.

mmtbx.validation.ramalyze.get_matching_atom_group(residue_group, altloc)
mmtbx.validation.ramalyze.get_omega_atoms(three)
mmtbx.validation.ramalyze.get_phi(prev_atoms, atoms)
mmtbx.validation.ramalyze.get_psi(atoms, next_atoms)
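The names get_dihedral, get_phi and get_psi point at the backbone torsion angles that a Ramachandran analysis is built on: phi is the dihedral over C(i-1), N, CA, C and psi the dihedral over N, CA, C, N(i+1). As a reminder of the underlying geometry, here is a standard dihedral calculation in plain Python. It is a generic sketch, not the mmtbx code; the function name dihedral_degrees and the test coordinates are made up.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: the central bond runs along z.
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

The atan2 form keeps the sign of the angle, which matters here because the Ramachandran plot spans the full range from -180 to 180 degrees on both axes.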
mmtbx.validation.ramalyze.isPrePro(residues, i)
mmtbx.validation.ramalyze.is_cis_peptide(three)
class mmtbx.validation.ramalyze.ramachandran(**kwds)

Bases: residue

Result class for protein backbone Ramachandran analysis (phenix.ramalyze).

altloc
as_JSON()

Returns an (empty) JSON object representing a single validation object. Should be overridden by each validation script to actually output the data.

as_csv()
as_hierarchical_JSON()
as_kinemage()

Returns a kinemage string for displaying an outlier.

as_string()
as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
chain_id
format_old()
static header()

Format for header in result listings.

icode
id_str_old()
markup
model_id
occupancy
outlier
phi
psi
rama_type
ramalyze_type()
res_type
residue_type()
resname
resseq
score
segid
xyz
class mmtbx.validation.ramalyze.ramachandran_ensemble(all_results)

Bases: residue

Container for results for an ensemble of residues.

altloc
atom_selection
chain_id
icode
markup
model_id
occupancy
outlier
phi
phi_min_max_mean()
phi_range()
psi
psi_min_max_mean()
rama_type
res_type
resname
resseq
score
score_statistics()
segid
xyz
class mmtbx.validation.ramalyze.ramachandran_plot(*args, **kwds)

Bases: simple_matplotlib_plot, ramachandran_plot_mixin

class mmtbx.validation.ramalyze.ramachandran_plot_mixin

Bases: rotarama_plot_mixin

extent = [-180, 180, -180, 180]
set_labels(y_marks=())
class mmtbx.validation.ramalyze.ramalyze(pdb_hierarchy, outliers_only=False, show_errors=False, out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, quiet=False)

Bases: validation

Frontend for calculating Ramachandran statistics for a model. Can directly generate the corresponding plots.

add_to_validation_counts(ev, model_id='')
as_JSON(addon_json={})
as_coot_data()

Return results in a format suitable for unpickling in Coot.

as_kinemage()
as_markup_for_kinemage(c_alphas)
display_wx_plots(parent=None, title='MolProbity - Ramachandran plots')
static evalScore(resType, value)
evaluateScore(resType, value)
fav_percent
get_allowed_count_and_fraction()
get_allowed_goal()
get_cis_pro_count_and_fraction()
get_favored_count_and_fraction()
get_favored_goal()
get_general_count_and_fraction()
get_gly_count_and_fraction()
get_ileval_count_and_fraction()
get_outliers_goal()
get_phi_psi_residues_count()
get_plot_data(position_type=0, residue_name='*', point_type=3)
get_plots(show_labels=True, point_style='bo', markersize=10, markeredgecolor='black', dpi=100, markerfacecolor='white', show_filling=True, show_contours=True)

Create a dictionary of six PNG images representing the plots for each residue type.

Parameters:
  • out – log filehandle

get_prepro_count_and_fraction()
get_result_class()
get_trans_pro_count_and_fraction()
gui_formats = ['%s', '%s', '%s', '%.2f', '%.1f', '%.1f']
gui_list_headers = ['Chain', 'Residue', 'Residue type', 'Score', 'Phi', 'Psi']
n_allowed
n_allowed_by_model
n_favored
n_favored_by_model
n_outliers
n_outliers_by_model
n_total
n_total_by_model
n_type
out_percent
output_header = 'residue:score%:phi:psi:evaluation:type'
property percent_allowed
property percent_favored
program_description = 'Analyze protein backbone ramachandran'
results
show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
write_plots(plot_file_base, out, show_labels=True, point_style='bo', markersize=10, markeredgecolor='black', show_filling=True, show_contours=True, dpi=100, markerfacecolor='white')

Write a set of six PNG images representing the plots for each residue type.

Parameters:
  • plot_file_base – file name prefix

  • out – log filehandle

wx_column_widths = [75, 125, 125, 100, 125, 125]

Clashscore - all-atom contacts using Reduce and Probe

Also available as a standalone program, phenix.clashscore.

All-atom contact analysis. Requires Reduce and Probe (installed separately).
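For orientation, the clashscore reported by MolProbity is conventionally the number of serious steric overlaps (0.4 Å or more) per 1000 atoms. The sketch below computes that ratio from a list of overlap magnitudes. It is a plain-Python illustration, not the mmtbx implementation: the function name compute_clashscore, the sample overlaps, and the convention of passing overlaps as positive magnitudes are assumptions.

```python
def compute_clashscore(overlaps, n_atoms, cutoff=0.4):
    """Serious clashes per 1000 atoms.

    overlaps: positive overlap magnitudes in Angstrom (assumed convention).
    n_atoms:  number of atoms considered, hydrogens included.
    """
    if n_atoms == 0:
        return 0.0
    n_clashes = sum(1 for overlap in overlaps if overlap >= cutoff)
    return 1000.0 * n_clashes / n_atoms


# Hypothetical contact list: three of the four overlaps reach the cutoff.
score = compute_clashscore([0.41, 0.55, 0.39, 0.72], n_atoms=2000)
```

Normalizing by atom count is what makes the number comparable between structures of different sizes.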

mmtbx.validation.clashscore.check_and_add_hydrogen(pdb_hierarchy=None, file_name=None, nuclear=False, keep_hydrogens=True, verbose=False, model_number=0, n_hydrogen_cut_off=0, time_limit=120, allow_multiple_models=True, crystal_symmetry=None, do_flips=False, log=None)

If no hydrogens present, force addition for clashscore calculation. Use REDUCE to add the hydrogen atoms.

Parameters:
  • pdb_hierarchy – pdb hierarchy

  • file_name (str) – pdb file name

  • nuclear (bool) – When True use nuclear cloud x-H distances and vdW radii, otherwise use electron cloud x-H distances and vdW radii

  • keep_hydrogens (bool) – when True, if there are hydrogen atoms, keep them

  • verbose (bool) – verbosity of printout

  • model_number (int) – the number of model to use

  • time_limit (int) – limit the time it takes to add hydrogen atoms

  • n_hydrogen_cut_off (int) – when number of hydrogen atoms < n_hydrogen_cut_off force keep_hydrogens to True

  • allow_multiple_models (bool) – Allow input that contains more than one model

  • crystal_symmetry – must provide crystal symmetry when using pdb_hierarchy

Returns:

  • (str) – PDB string

  • (bool) – True when PDB string was updated

mmtbx.validation.clashscore.check_and_report_reduce_failure(fb_object, input_lines, output_fname)
class mmtbx.validation.clashscore.clash(**kwds)

Bases: atoms

as_JSON()

Returns an (empty) JSON object representing a single validation object. Should be overridden by each validation script to actually output the data.

as_hierarchical_JSON()
as_string()
as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
atoms_info
format_old()
static header()

Format for header in result listings.

id_str(spacer=' ')

Returns a formatted (probably fixed-width) string describing the molecular entity being validated, independent of the analysis type.

id_str_no_atom_name()
id_str_src_atom_no_atom_name()
max_b_factor
outlier
overlap
probe_type
score
xyz
class mmtbx.validation.clashscore.clashscore(pdb_hierarchy, fast=False, condensed_probe=False, keep_hydrogens=True, nuclear=False, force_unique_chain_ids=False, time_limit=120, b_factor_cutoff=None, save_modified_hierarchy=False, verbose=False, do_flips=False, out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)

Bases: validation

as_JSON(addon_json={})
as_coot_data()

Return results in a format suitable for unpickling in Coot.

b_factor_cutoff
clash_dict
clash_dict_b_cutoff
clashscore
clashscore_b_cutoff
condensed_probe
fast
get_clashscore()
get_clashscore_b_cutoff()
get_result_class()
gui_formats = ['%s', '%s', '.3f']
gui_list_headers = ['Atom 1', 'Atom 2', 'Overlap']
list_dict
n_outliers
n_outliers_by_model
n_total
n_total_by_model
print_clashlist_old(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)
probe_clashscore_manager
probe_file
program_description = 'Analyze clashscore for protein model'
results
show(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='', outliers_only=None, verbose=None)
show_old_output(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, verbose=False)

For backwards compatibility with output formats of older utilities (phenix.ramalyze et al.).

show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
wx_column_widths = [150, 150, 150]
class mmtbx.validation.clashscore.condensed_probe_line_info(line, model_id='')

Bases: probe_line_info

mmtbx.validation.clashscore.decode_atom_string(atom_str, use_segids=False, model_id='')
class mmtbx.validation.clashscore.nqh_flip(**kwds)

Bases: residue

Backwards Asn/Gln/His sidechain, identified by Reduce’s hydrogen-bond network optimization.

altloc
as_string()
as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
atom_selection_string()
chain_id
icode
id_str(ignore_altloc=False)

Returns a formatted (probably fixed-width) string describing the molecular entity being validated, independent of the analysis type.

occupancy
outlier
resname
resseq
score
segid
xyz
class mmtbx.validation.clashscore.nqh_flips(pdb_hierarchy)

Bases: validation

N/Q/H sidechain flips identified by Reduce.

gui_formats = ['%s', '%s']
gui_list_headers = ['Chain', 'Residue']
n_outliers
n_outliers_by_model
n_total
n_total_by_model
results
show(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
wx_column_widths = [75, 220]
class mmtbx.validation.clashscore.probe_clashscore_manager(h_pdb_string, fast=False, condensed_probe=False, nuclear=False, largest_occupancy=10, b_factor_cutoff=None, use_segids=False, verbose=False, model_id='')

Bases: object

filter_dicts(new_clash_hash, new_hbond_hash)
get_condensed_clashes(lines)
process_raw_probe_output(probe_unformatted)
put_group_into_dict(line_info, clash_hash, hbond_hash)
run_probe_clashscore(pdb_string)
class mmtbx.validation.clashscore.probe_line_info(line, model_id='')

Bases: object

as_clash_obj(use_segids)
is_similar(other)
class mmtbx.validation.clashscore.raw_probe_line_info(line, model_id='')

Bases: probe_line_info

C-beta deviations

Also available as a standalone program, phenix.cbetadev.

Validation of protein geometry by analysis of the positions of C-beta sidechain atoms. Significant deviations from ideality often indicate misfit rotamers and/or necessity of mainchain movement, especially alternate conformations.

Reference:

Lovell SC, Davis IW, Arendall WB 3rd, de Bakker PI, Word JM, Prisant MG, Richardson JS, Richardson DC. Structure validation by Calpha geometry: phi,psi and Cbeta deviation. Proteins. 2003 Feb 15;50(3):437-50. http://www.ncbi.nlm.nih.gov/pubmed/12557186
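The quantity being validated here is simply the distance between the modeled C-beta position and the ideal position constructed from the backbone. A minimal sketch of that comparison follows. The 0.25 Å cutoff is the value conventionally used in MolProbity-style reports and is stated here as an assumption; the coordinates and helper names are made up, and building the ideal position itself (the job of calculate_ideal_and_deviation) is not shown.

```python
import math

ASSUMED_CUTOFF = 0.25  # Angstrom; conventional outlier threshold, assumed here


def cbeta_deviation(observed_xyz, ideal_xyz):
    """Euclidean distance between the modeled and the ideal C-beta position."""
    return math.sqrt(sum((o - i) ** 2 for o, i in zip(observed_xyz, ideal_xyz)))


def is_cbeta_outlier(observed_xyz, ideal_xyz, cutoff=ASSUMED_CUTOFF):
    return cbeta_deviation(observed_xyz, ideal_xyz) >= cutoff


# Made-up coordinates: the modeled atom sits 0.5 A from its ideal position.
observed = (1.0, 2.0, 3.0)
ideal = (1.0, 2.3, 3.4)
deviation = cbeta_deviation(observed, ideal)
flag = is_cbeta_outlier(observed, ideal)
```

A large deviation says that the side chain and the backbone disagree about where C-beta should be, which is why it often points to a misfit rotamer or a backbone that needs to move.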

class mmtbx.validation.cbetadev.calculate_ideal_and_deviation(relevant_atoms, resname)

Bases: object

deviation
dihedral
ideal
class mmtbx.validation.cbetadev.cbeta(**kwds)

Bases: residue

Result class for protein C-beta deviation analysis (phenix.cbetadev).

altloc
as_JSON()

Returns an (empty) JSON object representing a single validation object. Should be overridden by each validation script to actually output the data.

as_bullseye_label()
as_bullseye_point()
as_hierarchical_JSON()
as_kinemage()

Returns a kinemage string for displaying an outlier.

as_string()
as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
chain_id
deviation
dihedral_NABB
format_old()
static header()

Format for header in result listings.

icode
ideal_xyz
model_id
occupancy
outlier
resname
resseq
score
segid
xyz
class mmtbx.validation.cbetadev.cbetadev(pdb_hierarchy, outliers_only=False, out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, collect_ideal=False, apply_phi_psi_correction=False, display_phi_psi_correction=False, exclude_d_peptides=False, quiet=False)

Bases: validation

as_JSON(addon_json={})
as_bullseye_kinemage(pdbid='')
as_coot_data()

Return results in a format suitable for unpickling in Coot.

as_kinemage(chain_id=None)
beta_ideal
get_beta_ideal()
get_expected_count()
get_outlier_count()
get_outlier_percent()
get_result_class()
get_result_count()
get_weighted_outlier_count()
get_weighted_outlier_percent()
get_weighted_result_count()
gui_formats = ['%s', '%s', '%.3f', '%.2f']
gui_list_headers = ['Chain', 'Residue', 'Deviation', 'Angle']
n_outliers
n_outliers_by_model
n_total
n_total_by_model
new_outliers
outliers_removed
output_header = 'pdb:alt:res:chainID:resnum:dev:dihedralNABB:Occ:ALT:'
percent_outliers
program_description = 'Analyze protein sidechain C-beta deviation'
results
show_old_output(out, verbose=False, prefix='pdb')

For backwards compatibility with output formats of older utilities (phenix.ramalyze et al.).

show_summary(out, prefix='')
stats
wx_column_widths = [75, 125, 100, 100]
mmtbx.validation.cbetadev.construct_fourth(resN, resCA, resC, dist, angle, dihedral, method='NCAB')
mmtbx.validation.cbetadev.extract_atoms_from_residue_group(residue_group)

Given a residue_group object, which may or may not have multiple conformations, extract the relevant atoms for each conformer, taking into account any atoms shared between conformers. This is implemented separately from the main validation routine, which accesses the hierarchy object via the chain->conformer->residue API. Returns a list of hashes, each suitable for calling calculate_ideal_and_deviation.

mmtbx.validation.cbetadev.get_phi_psi_dict(pdb_hierarchy)
mmtbx.validation.cbetadev.idealized_calpha_angles(resname, chiral_volume=None)

mmtbx.validation.restraints module

Validation of models of any type against basic covalent geometry restraints. By default this will flag all restrained atoms deviating by more than 4 sigma from the target value.
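The 4-sigma rule can be illustrated with a small calculation: the deviation of the model value from the target is divided by the restraint's sigma, and the result is compared with the cutoff. This is a plain-Python sketch, not the mmtbx code. The names model, target and sigma mirror the attributes listed for the restraint classes below, while the helper names, the sample bond values and the use of a strict greater-than comparison are assumptions.

```python
SIGMA_CUTOFF = 4.0  # default sigma_cutoff shown in the class signatures


def deviation_in_sigmas(model, target, sigma):
    """Absolute deviation from the target, expressed in units of sigma."""
    return abs(model - target) / sigma


def is_restraint_outlier(model, target, sigma, sigma_cutoff=SIGMA_CUTOFF):
    return deviation_in_sigmas(model, target, sigma) > sigma_cutoff


# Hypothetical bond restraint: target 1.525 A with sigma 0.015 A.
stretched = deviation_in_sigmas(model=1.600, target=1.525, sigma=0.015)
normal = deviation_in_sigmas(model=1.540, target=1.525, sigma=0.015)
flags = [is_restraint_outlier(m, 1.525, 0.015) for m in (1.600, 1.540)]
```

Expressing the deviation in sigmas puts bonds, angles and dihedrals on one scale, so a single cutoff can be applied across restraint types with very different units and tolerances.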

class mmtbx.validation.restraints.angle(**kwds)

Bases: restraint

as_kinemage()

Returns a kinemage string for displaying an outlier.

atom_selection
atoms_info
delta
model
n_atoms = 3

Base class for covalent stereochemistry restraint outliers (except for planarity, which is weird and different). Unlike most of the other outlier implementations elsewhere in the validation module, the restraint outliers are printed on multiple lines to facilitate display of the atoms involved.

outlier
residual
score
sigma
target
xyz
class mmtbx.validation.restraints.angles(pdb_atoms, sites_cart, energies_sites, restraint_proxies, unit_cell, ignore_hd=True, sigma_cutoff=4.0, outliers_only=True, reverse_sort=False, use_segids_in_place_of_chainids=False)

Bases: restraint_validation

get_outliers(proxies, unit_cell, sites_cart, pdb_atoms, sigma_cutoff, outliers_only=True, use_segids_in_place_of_chainids=False)
get_result_class()
kinemage_header = '@subgroup {geom devs} dominant\n'
max
mean
min
n_outliers
n_outliers_by_model
n_total
n_total_by_model
restraint_label = 'Bond angle'
restraint_type = 'angle'
results
target
z_max
z_mean
z_min
class mmtbx.validation.restraints.bond(**kwds)

Bases: restraint

as_kinemage()

Returns a kinemage string for displaying an outlier.

as_table_row_phenix()

Values for populating ListCtrl in Phenix GUI.

atom_selection
atoms_info
delta
formate_values()
static header()

Format for header in result listings.

model
n_atoms = 2

Base class for covalent stereochemistry restraint outliers (except for planarity, which is weird and different). Unlike most of the other outlier implementations elsewhere in the validation module, the restraint outliers are printed on multiple lines to facilitate display of the atoms involved.

outlier
residual
score
sigma
slack
symop
target
xyz
class mmtbx.validation.restraints.bonds(pdb_atoms, sites_cart, energies_sites, restraint_proxies, unit_cell, ignore_hd=True, sigma_cutoff=4.0, outliers_only=True, reverse_sort=False, use_segids_in_place_of_chainids=False)

Bases: restraint_validation

get_n_total_by_model(energies_sites, sites_cart, pdb_atoms)
get_outliers(proxies, unit_cell, sites_cart, pdb_atoms, sigma_cutoff, outliers_only=True, use_segids_in_place_of_chainids=False)
get_result_class()
gui_formats = ['%s', '%s', '%.3f', '%.3f', '%.1f']
gui_list_headers = ['Atom 1', 'Atom 2', 'Ideal value', 'Model value', 'Deviation (sigmas)']
kinemage_header = '@subgroup {length devs} dominant\n'
max
mean
min
n_outliers
n_outliers_by_model
n_total
n_total_by_model
restraint_label = 'Bond length'
restraint_type = 'bond'
results
target
wx_column_widths = [150, 150, 100, 100, 180]
z_max
z_mean
z_min
class mmtbx.validation.restraints.chiralities(pdb_atoms, sites_cart, energies_sites, restraint_proxies, unit_cell, ignore_hd=True, sigma_cutoff=4.0, outliers_only=True, reverse_sort=False, use_segids_in_place_of_chainids=False)

Bases: restraint_validation

as_JSON(addon_json={})
get_outliers(proxies, unit_cell, sites_cart, pdb_atoms, sigma_cutoff, outliers_only=True, use_segids_in_place_of_chainids=False)
get_result_class()
gui_formats = ['%s', '%.3f', '%.3f', '%.1f', '%s']
gui_list_headers = ['Atoms', 'Ideal value', 'Model value', 'Deviation (sigmas)', 'Probable cause']
increment_category_counts(model_id, outlier)
kinemage_header = '@subgroup {chiral devs} dominant\n'
max
mean
min
n_chiral_by_model
n_handedness_by_model
n_outliers
n_outliers_by_model
n_pseudochiral_by_model
n_tetrahedral_by_model
n_total
n_total_by_model
restraint_label = 'Chiral volume'
restraint_type = 'chirality'
results
target
wx_column_widths = [250, 100, 100, 180, 250]
z_max
z_mean
z_min
class mmtbx.validation.restraints.chirality(**kwds)

Bases: restraint

as_kinemage()

Returns a kinemage string for displaying an outlier.

as_table_row_phenix()

Values for populating ListCtrl in Phenix GUI.

atom_selection
atoms_info
delta
is_handedness_swap()
is_pseudochiral()
model
outlier
outlier_type()
residual
score
sigma
target
xyz
class mmtbx.validation.restraints.combined(pdb_hierarchy, xray_structure, geometry_restraints_manager, ignore_hd=True, sigma_cutoff=4.0, outliers_only=True, reverse_sort=False, use_segids_in_place_of_chainids=False, cdl=None)

Bases: slots_getstate_setstate

Container for individual validations of each of the five covalent restraint classes.

angles
as_kinemage(chain_id=None)
bonds
chiralities
dihedrals
get_bonds_angles_rmsds()
planarities
show(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='', verbose=True)
class mmtbx.validation.restraints.dihedral(**kwds)

Bases: restraint

as_kinemage()

Returns a kinemage string for displaying an outlier.

atom_selection
atoms_info
delta
model
n_atoms = 4

Base class for covalent stereochemistry restraint outliers (except for planarity, which is weird and different). Unlike most of the other outlier implementations elsewhere in the validation module, the restraint outliers are printed on multiple lines to facilitate display of the atoms involved.

outlier
residual
score
sigma
target
xyz
class mmtbx.validation.restraints.dihedrals(pdb_atoms, sites_cart, energies_sites, restraint_proxies, unit_cell, ignore_hd=True, sigma_cutoff=4.0, outliers_only=True, reverse_sort=False, use_segids_in_place_of_chainids=False)

Bases: restraint_validation

get_outliers(proxies, unit_cell, sites_cart, pdb_atoms, sigma_cutoff, outliers_only=True, use_segids_in_place_of_chainids=False)
get_result_class()
max
mean
min
n_outliers
n_outliers_by_model
n_total
n_total_by_model
restraint_label = 'Dihedral angle'
restraint_type = 'dihedral'
results
target
z_max
z_mean
z_min
mmtbx.validation.restraints.get_mean_xyz(atoms)
class mmtbx.validation.restraints.planarities(pdb_atoms, sites_cart, energies_sites, restraint_proxies, unit_cell, ignore_hd=True, sigma_cutoff=4.0, outliers_only=True, reverse_sort=False, use_segids_in_place_of_chainids=False)

Bases: restraint_validation

get_n_total_by_model(energies_sites, sites_cart, pdb_atoms)
get_outliers(proxies, unit_cell, sites_cart, pdb_atoms, sigma_cutoff, outliers_only=True, use_segids_in_place_of_chainids=False)
get_result_class()
gui_formats = ['%s', '%.3f', '%.3f', '%.1f']
gui_list_headers = ['Atoms', 'Max. delta', 'RMS(delta)', 'Deviation (sigmas)']
max
mean
min
n_outliers
n_outliers_by_model
n_total
n_total_by_model
restraint_label = 'Planar group'
restraint_type = 'planarity'
results
target
wx_column_widths = [250, 100, 100, 130]
z_max
z_mean
z_min
class mmtbx.validation.restraints.planarity(**kwds)

Bases: restraint

as_kinemage()

Returns a kinemage string for displaying an outlier.

as_table_row_phenix()

Values for populating ListCtrl in Phenix GUI.

atom_selection
atoms_info
delta_max
format_values()
static header()

Format for header in result listings.

outlier
residual
rms_deltas
score
xyz
class mmtbx.validation.restraints.restraint(**kwds)

Bases: atoms

as_JSON()

Returns an (empty) JSON object representing a single validation object. Should be overridden by each validation script to actually output the data.

as_hierarchical_JSON()
as_string(prefix='')
as_table_row_phenix()

Values for populating ListCtrl in Phenix GUI.

atom_selection
atoms_info
delta
format_values()
static header()

Format for header in result listings.

id_str(ignore_altloc=None)

Returns a formatted (probably fixed-width) string describing the molecular entity being validated, independent of the analysis type.

kinemage_key()
model
n_atoms = None

Base class for covalent stereochemistry restraint outliers (except for planarity, which is weird and different). Unlike most of the other outlier implementations elsewhere in the validation module, the restraint outliers are printed on multiple lines to facilitate display of the atoms involved.

outlier
residual
score
sigma
target
xyz
class mmtbx.validation.restraints.restraint_validation(pdb_atoms, sites_cart, energies_sites, restraint_proxies, unit_cell, ignore_hd=True, sigma_cutoff=4.0, outliers_only=True, reverse_sort=False, use_segids_in_place_of_chainids=False)

Bases: validation

Base class for collecting information about all restraints of a certain type, including overall statistics and individual outliers.

as_JSON(addon_json={})
as_kinemage(chain_id=None)
get_n_total_by_model(energies_sites, sites_cart, pdb_atoms)
get_outliers(proxies, unit_cell, sites_cart, pdb_atoms, sigma_cutoff)
gui_formats = ['%s', '%.3f', '%.3f', '%.1f']
gui_list_headers = ['Atoms', 'Ideal value', 'Model value', 'Deviation (sigmas)']
kinemage_header = None
max
mean
min
n_outliers
n_outliers_by_model
n_total
n_total_by_model
restraint_type = None
results
show(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='  ', verbose=True)
show_old_output(*args, **kwds)

For backwards compatibility with output formats of older utilities (phenix.ramalyze et al.).

show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
target
wx_column_widths = [500, 100, 100, 180]
z_max
z_mean
z_min
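
The relationship between the target, model, sigma and score attributes of a restraint result and the sigma_cutoff argument can be illustrated with a minimal standalone sketch. It does not use mmtbx. The helper name flag_outliers and the tuple layout are hypothetical, and treating a restraint as an outlier once its absolute deviation reaches sigma_cutoff is an assumption about the comparison, not a statement of the library's implementation.

```python
from statistics import mean


def flag_outliers(restraints, sigma_cutoff=4.0):
    """Hypothetical helper, not part of mmtbx.

    restraints: list of (label, target, model, sigma) tuples.
    Returns one dict per restraint with its deviation in sigma units.
    """
    results = []
    for label, target, model, sigma in restraints:
        delta = model - target
        score = abs(delta) / sigma  # deviation expressed in sigmas
        results.append({
            "label": label,
            "delta": delta,
            "score": score,
            # Assumed comparison: outlier when the deviation reaches the cutoff.
            "outlier": score >= sigma_cutoff,
        })
    return results


# Illustrative bond lengths in Angstroms (made-up values).
bonds = [
    ("CA-CB", 1.530, 1.535, 0.020),  # small deviation
    ("C-N", 1.329, 1.429, 0.014),    # large deviation
]
results = flag_outliers(bonds)
outliers = [r for r in results if r["outlier"]]
z_mean = mean(r["score"] for r in results)
```

With outliers_only=True, only the entries flagged here would be expected in a results list; the overall statistics are still computed over every restraint.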

mmtbx.validation.rna_validate module

class mmtbx.validation.rna_validate.rna_angle(**kwds)

Bases: atoms

as_JSON()

Returns an (empty) JSON object representing a single validation object. Should be overridden by each validation script to actually output the data.

as_hierarchical_JSON()
as_string(prefix='')
as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
atoms_info
delta
format_values()
static header()

Format for header in result listings.

id_str(spacer=' ')

Returns a formatted (probably fixed-width) string describing the molecular entity being validated, independent of the analysis type.

outlier
score
sigma
xyz
class mmtbx.validation.rna_validate.rna_angles(pdb_hierarchy, pdb_atoms, geometry_restraints_manager, outliers_only=True)

Bases: rna_geometry

as_JSON(addon_json={})
get_result_class()
gui_formats = ['%s', '%s', '%s', '%s', '%.2f']
gui_list_headers = ['Residue', 'Atom 1', 'Atom 2', 'Atom 3', 'Sigmas']
label = 'Backbone bond angles'
n_outliers
n_outliers_by_model
n_total
n_total_by_model
output_header = '#residue:atom_1:atom_2:atom_3:num_sigmas'
results
show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
wx_column_widths = [160, 160, 160, 160, 160]
class mmtbx.validation.rna_validate.rna_bond(**kwds)

Bases: atoms

as_JSON()

Returns an (empty) JSON object representing a single validation object. Should be overridden by each validation script to actually output the data.

as_hierarchical_JSON()
as_string(prefix='')
as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
atoms_info
delta
format_values()
static header()

Format for header in result listings.

id_str(spacer=' ')

Returns a formatted (probably fixed-width) string describing the molecular entity being validated, independent of the analysis type.

outlier
score
sigma
xyz
class mmtbx.validation.rna_validate.rna_bonds(pdb_hierarchy, pdb_atoms, geometry_restraints_manager, outliers_only=True)

Bases: rna_geometry

as_JSON(addon_json={})
get_result_class()
gui_formats = ['%s', '%s', '%s', '%.2f']
gui_list_headers = ['Residue', 'Atom 1', 'Atom 2', 'Sigmas']
label = 'Backbone bond lenths'
n_outliers
n_outliers_by_model
n_total
n_total_by_model
output_header = '#residue:atom_1:atom_2:num_sigmas'
results
show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
wx_column_widths = [160, 160, 160, 160]
class mmtbx.validation.rna_validate.rna_pucker(**kwds)

Bases: residue

Validation using pucker-specific restraints library.

Pperp_distance
altloc
as_JSON()

Returns an (empty) JSON object representing a single validation object. Should be overridden by each validation script to actually output the data.

as_hierarchical_JSON()
as_string(prefix='')
as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
chain_id
delta_angle
epsilon_angle
format_values()
static header()

Format for header in result listings.

icode
is_delta_outlier
is_epsilon_outlier
model_id
occupancy
outlier
probable_pucker
resname
resseq
score
segid
xyz
class mmtbx.validation.rna_validate.rna_pucker_ref

Bases: object

get_rna_pucker_ref(test_pucker=False)
class mmtbx.validation.rna_validate.rna_puckers(pdb_hierarchy, params=None, outliers_only=True)

Bases: rna_geometry

as_JSON(addon_json={})
get_result_class()
get_sugar_xyz_mean(residue)
gui_formats = ['%s', '%.2f', '%.2f']
gui_list_headers = ['Residue', 'Delta', 'Epsilon']
label = 'Sugar pucker'
local_altloc_from_atoms(residue_1_deoxy_ribo_atom_dict, residue_1_c1p_outbound_atom, residue_2_p_atom)
n_outliers
n_outliers_by_model
n_total
n_total_by_model
output_header = '#residue:delta_angle:is_delta_outlier:epsilon_angle:is_epsilon_outler'
pucker_dist
pucker_perp_xyz
pucker_states
results
show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
wx_column_widths = [200, 200, 200]
class mmtbx.validation.rna_validate.rna_validation(pdb_hierarchy, geometry_restraints_manager=None, params=None, outliers_only=True)

Bases: slots_getstate_setstate

angles
as_JSON(addon_json={})
bonds
puckers
show(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='', outliers_only=None, verbose=True)
show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
suites

mmtbx.validation.analyze_peptides module

mmtbx.validation.analyze_peptides.analyze(pdb_hierarchy)
mmtbx.validation.analyze_peptides.get_master_phil()
mmtbx.validation.analyze_peptides.get_omega(prev_atoms, atoms)
mmtbx.validation.analyze_peptides.is_cis_peptide(prev_atoms, atoms)
mmtbx.validation.analyze_peptides.is_trans_peptide(prev_atoms, atoms)
mmtbx.validation.analyze_peptides.run(args)
mmtbx.validation.analyze_peptides.show(cis_peptides, trans_peptides, outliers, log=None, cis_only=True)
mmtbx.validation.analyze_peptides.usage()

mmtbx.validation.experimental module

Model validation against experimental data, in both real and reciprocal space. This does not actually handle any of the scaling and fmodel calculations, which are performed approximately as in model_vs_data.

class mmtbx.validation.experimental.data_statistics(fmodel, raw_data=None, n_bins=10, count_anomalous_pairs_separately=False)

Bases: slots_getstate_setstate

anomalous_flag
completeness
completeness_outer
d_max
d_max_outer
d_min
d_min_outer
info
n_free
n_free_outer
n_refl
n_refl_outer
n_refl_refine
n_refl_refine_outer
r_free
r_free_outer
r_work
r_work_outer
show(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
twin_law
wavelength
wilson_b
mmtbx.validation.experimental.merging_and_model_statistics(f_obs, f_model, r_free_flags, unmerged_i_obs, n_bins=20, sigma_filtering=<libtbx.AutoType object>, anomalous=False, use_internal_variance=True)

Compute merging statistics - CC* in particular - together with measures of model quality using reciprocal-space data (R-factors and CCs). See Karplus & Diederichs 2012 for rationale.
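
For orientation, CC* is derived from the half-dataset correlation CC1/2 as CC* = sqrt(2 CC1/2 / (1 + CC1/2)), following Karplus & Diederichs (2012). The sketch below is a standalone illustration of that relation only; the function name cc_star is hypothetical, and returning 0.0 for non-positive CC1/2 is a choice made for this sketch rather than documented library behavior.

```python
import math


def cc_star(cc_half):
    """Estimate CC* from CC1/2 (hypothetical helper, not the mmtbx API)."""
    if cc_half <= 0:
        # Assumption of this sketch: clamp instead of taking the square
        # root of a negative number.
        return 0.0
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))


# CC* is always at least as large as CC1/2 for values between 0 and 1.
for cc_half in (0.1, 0.5, 0.9):
    print("CC1/2 = %.2f  CC* = %.3f" % (cc_half, cc_star(cc_half)))
```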

class mmtbx.validation.experimental.real_space(model, fmodel, cc_min=0.8, molprobity_map_params=None)

Bases: validation

Real-space correlation calculation for residues

add_water(water=None)

Function for incorporating water results from water.py

everything
fsc
get_result_class()
gui_formats = ['%s', '%6.2f', '%4.2f', '%6.2f', '%6.2f', '%5.3f']
gui_list_headers = ['Residue', 'B_iso', 'Occupancy', '2Fo-Fc', 'Fmodel', 'CC']
n_outliers
n_outliers_by_model
n_total
n_total_by_model
other
output_header = None
overall_rsc
program_description = 'Analyze real space correlation'
protein
results
show(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='  ', verbose=True)
show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
water
wx_column_widths = [120, 120, 120, 120, 120, 120]
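
As a rough illustration of what a per-residue real-space correlation and the cc_min threshold mean, the following standalone sketch computes a Pearson correlation between two lists of density values and flags it against a cutoff. It does not use mmtbx: the function names are hypothetical, the density values are made up, and how the library actually samples the maps around each residue is not shown here.

```python
import math


def pearson_cc(a, b):
    """Pearson correlation between two equal-length lists of density values."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # flat density: correlation is undefined, report 0
    return cov / math.sqrt(var_a * var_b)


def is_cc_outlier(cc, cc_min=0.8):
    """Assumed rule for this sketch: a residue is an outlier when CC < cc_min."""
    return cc < cc_min


# Made-up density samples at the same grid points in two maps.
two_fofc_values = [0.1, 0.8, 1.5, 0.9, 0.2]
fmodel_values = [0.0, 0.7, 1.6, 1.0, 0.1]
cc = pearson_cc(two_fofc_values, fmodel_values)
flagged = is_cc_outlier(cc)
```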
class mmtbx.validation.experimental.residue_real_space(**kwds)

Bases: residue

altloc
as_string()
as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
b_iso
property cc
chain_id
fmodel
fofc
static header()

Format for header in result listings.

icode
occupancy
outlier
resname
resseq
score
segid
two_fofc
xyz

mmtbx.validation.graphics module

Base classes for visualization of MolProbity analysis using matplotlib.

class mmtbx.validation.graphics.multi_criterion_plot_mixin(binner, y_limits)

Bases: object

plot_range(i_bin)
class mmtbx.validation.graphics.residue_bin

Bases: slots_getstate_setstate

add_empty(n)
add_residue(residue)
get_outlier_plot_values()
get_real_space_plot_values()
get_residue_range()
get_selected(index)
labels
marks
n_res()
residues
x_values()
class mmtbx.validation.graphics.residue_binner(res_list, bin_size=100, one_chain_per_bin=False)

Bases: object

get_bin(i_bin)
get_ranges()
class mmtbx.validation.graphics.rotarama_plot_mixin

Bases: object

draw_plot(stats, title, points=None, show_labels=True, colormap='jet', contours=None, xyz=None, extent=None, y_marks=None, markerfacecolor='white', markeredgecolor='black', show_filling=True, markersize=5, point_style='bo')
extent = [0, 360, 0, 360]

mmtbx.validation.ligands module

mmtbx.validation.ligands.compare_ligands(ligand_code, hierarchy_1=None, hierarchy_2=None, pdb_file_1=None, pdb_file_2=None, max_distance_between_centers_of_mass=8.0, exclude_hydrogens=True, verbose=False, implicit_matching=False, out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)
mmtbx.validation.ligands.compare_ligands_impl(ligand, reference_ligands, max_distance_between_centers_of_mass=8.0, exclude_hydrogens=True, implicit_matching=False, verbose=False, quiet=False, raise_sorry_if_no_matching_atoms=True, out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)

Given a target ligand and a list of reference ligands, return the RMSD(s) for any ligand determined to be approximately equivalent. (Usually there will be just one of these, but this allows for alternate conformations.)
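
The idea of comparing only ligands that sit close to each other can be sketched without mmtbx as follows. This is a simplified illustration under stated assumptions: the helper names are hypothetical, atoms are matched by name, an unweighted centroid stands in for the center of mass, and no superposition is performed before the RMSD is computed.

```python
import math


def centroid(coords):
    """Unweighted mean position of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def rmsd_if_equivalent(ligand, reference, max_center_distance=8.0):
    """Hypothetical helper: ligand and reference map atom name -> (x, y, z).

    Returns the RMSD over shared atom names, or None when the two ligands
    are too far apart or share no atoms.
    """
    center_1 = centroid(list(ligand.values()))
    center_2 = centroid(list(reference.values()))
    if distance(center_1, center_2) > max_center_distance:
        return None
    shared = sorted(set(ligand) & set(reference))
    if not shared:
        return None
    sum_sq = sum(distance(ligand[n], reference[n]) ** 2 for n in shared)
    return math.sqrt(sum_sq / len(shared))


target = {"C1": (0.0, 0.0, 0.0), "O1": (1.0, 0.0, 0.0)}
nearby = {"C1": (0.0, 0.0, 1.0), "O1": (1.0, 0.0, 1.0)}
distant = {"C1": (20.0, 0.0, 0.0), "O1": (21.0, 0.0, 0.0)}
rmsds = [rmsd_if_equivalent(target, ref) for ref in (nearby, distant)]
```

Applying the helper to several reference ligands, as in the last line, yields one value per reference, with None for references that are not considered equivalent.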

mmtbx.validation.ligands.extract_ligand_atom_group(hierarchy, ligand_code, only_segid=None)
mmtbx.validation.ligands.extract_ligand_residue(hierarchy, ligand_code)
class mmtbx.validation.ligands.ligand_validation(ligand, pdb_hierarchy, xray_structure, two_fofc_map, fofc_map, fmodel_map, reference_ligands=None, two_fofc_map_cutoff=1.5, fofc_map_cutoff=-3.0)

Bases: slots_getstate_setstate

atom_selection
b_iso_mean
cc
fofc_max
fofc_mean
fofc_min
id_str
n_below_fofc_cutoff
n_below_two_fofc_cutoff
occupancy_mean
pbss
rmsds
show(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)
show_simple(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, warnings=True)
two_fofc_max
two_fofc_mean
two_fofc_min
xyz_center
mmtbx.validation.ligands.show_validation_results(validations, out, verbose=True)
mmtbx.validation.ligands.validate_ligands(pdb_hierarchy, fmodel, ligand_code, only_segid=None, reference_structure=None, export_for_web=False, output_dir=None)

mmtbx.validation.model_properties module

Analysis of model properties, independent of data.

class mmtbx.validation.model_properties.atom_bfactor(**kwds)

Bases: atom_occupancy

altloc
atom_selection
b_iso
chain_id
element
icode
model_id
name
occupancy
outlier
resname
resseq
score
segid
symop
xyz
class mmtbx.validation.model_properties.atom_occupancy(**kwds)

Bases: atom

Container for single-atom occupancy outliers (usually atoms with zero occupancy).

altloc
as_string(prefix='')
as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
b_iso
chain_id
element
icode
id_str()

Returns a formatted (probably fixed-width) string describing the molecular entity being validated, independent of the analysis type.

model_id
name
occupancy
outlier
resname
resseq
score
segid
symop
xyz
class mmtbx.validation.model_properties.model_statistics(pdb_hierarchy, xray_structure, all_chain_proxies=None, ignore_hd=True, ligand_selection=None)

Bases: slots_getstate_setstate

Atom statistics for the overall model, and various selections within. This does not actually contain individual outliers, which are instead held in the xray_structure_statistics objects for subsets of the model.

all
ignore_hd
property ligands
property macromolecules
n_atoms
n_hydrogens
n_models
n_nuc
n_polymer
n_protein
n_waters
show(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
property water
class mmtbx.validation.model_properties.residue_bfactor(**kwds)

Bases: residue_occupancy

altloc
as_string(prefix='')
atom_selection
b_iso
chain_id
chain_type
icode
occupancy
outlier
resname
resseq
score
segid
xyz
class mmtbx.validation.model_properties.residue_occupancy(**kwds)

Bases: residue

altloc
as_string(prefix='')
as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
b_iso
chain_id
chain_type
icode
occupancy
outlier
resname
resseq
score
segid
xyz
class mmtbx.validation.model_properties.xray_structure_statistics(pdb_hierarchy, xray_structure, ignore_hd=True, collect_outliers=True)

Bases: validation

Occupancy and B-factor statistics.

as_gui_table_data(property_type=None, outliers_only=True, include_zoom=False)

Format results for display in the Phenix GUI.

b_histogram
b_max
b_mean
b_min
bad_adps
different_occ
gui_formats = ['%s', '%s', '%.2f', '%.2f']
gui_list_headers = ['Atom(s)', 'Type', 'Occupancy', 'Isotropic B-factor']
iter_results(property_type=None, outliers_only=True)
n_all
n_aniso
n_aniso_h
n_atoms
n_hd
n_non_hd
n_npd
n_outliers
n_outliers_by_model
n_total
n_total_by_model
n_zero_b
n_zero_occ
o_max
o_mean
o_min
partial_occ
results
show_bad_occupancy(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
show_bfactors(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='')
wx_column_widths = [300, 100, 100, 200]
zero_occ

mmtbx.validation.sequence module

class mmtbx.validation.sequence.chain(chain_id, sequence, resids, chain_type, sec_str=None, resnames=None)

Bases: object

Stores information on a protein or nucleic acid chain, its alignment to the target sequence, and the coordinates of each residue. For command-line use, much of this is irrelevant. In the PHENIX GUI, much of this data is fed to the sequence/alignment viewer (wxtbx.sequence_view), which controls the graphics window(s).

extract_coordinates(pdb_chain)

Collect the coordinates of the central atom (CA or P) in each residue, padding the array with None so that it matches the sequence and resid arrays.

extract_residue_groups(pdb_chain)
get_alignment(include_sec_str=False)
get_coordinates_for_alignment_range(i1, i2)
get_coordinates_for_alignment_ranges(ranges)
get_highlighted_residues()

Used for wxtbx.sequence_view to highlight mismatches, etc.

get_mean_coordinate_for_alignment_range(*args, **kwds)
get_mean_coordinate_for_alignment_ranges(*args, **kwds)
get_outliers_table()

Used in PHENIX validation GUI

iter_residue_groups(pdb_chain)
set_alignment(alignment, sequence_name, sequence_id)
show_summary(out, verbose=True)
mmtbx.validation.sequence.get_mean_coordinate(sites)
mmtbx.validation.sequence.get_sequence_n_copies(sequences, pdb_hierarchy, force_accept_composition=False, copies_from_xtriage=None, copies_from_user=<libtbx.AutoType object>, minimum_identity=0.3, assume_xtriage_copies_from_sequence_file=None, out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, nproc=1)

Utility function for reconciling the contents of the sequence file, the chains in the model, and the ASU. Returns the number of copies of the sequence file to tell Phaser are present, or raises an error if this is either ambiguous or in conflict with the search model multiplicity. This is intended to allow the user to specify any combination of inputs - for instance, given a tetrameric search model, 2 copies in the ASU, and a monomer sequence file, the ASU contains 8 copies of the sequence(s).
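
The arithmetic in the tetramer example above can be written out as a small standalone sketch. It covers only the counting step: the function and argument names are hypothetical, sequence matching is not performed, and raising ValueError for an inconsistent combination is a stand-in for whatever error the library itself raises.

```python
def n_sequence_copies(chains_in_search_model, model_copies_in_asu,
                      chains_in_sequence_file):
    """Hypothetical helper illustrating the copy-number arithmetic only."""
    total_chains = chains_in_search_model * model_copies_in_asu
    if total_chains % chains_in_sequence_file != 0:
        # Assumption of this sketch: a non-integer result is treated as a
        # conflict between the inputs.
        raise ValueError("inputs do not give a whole number of copies")
    return total_chains // chains_in_sequence_file


# Tetrameric search model, 2 copies in the ASU, monomer sequence file.
copies = n_sequence_copies(4, 2, 1)
print(copies)  # 8
```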

mmtbx.validation.sequence.get_sequence_n_copies_from_files(seq_file, pdb_file, **kwds)
mmtbx.validation.sequence.group_chains_and_sequences(seq_file, pdb_file, **kwds)
class mmtbx.validation.sequence.validation(pdb_hierarchy, sequences, params=None, log=None, nproc=<libtbx.AutoType object>, include_secondary_structure=False, extract_coordinates=False, extract_residue_groups=False, minimum_identity=0, custom_residues=[], ignore_hetatm=False)

Bases: object

align_chain(i)
get_missing_chains()
get_relative_sequence_copy_number()

Count the number of copies of each sequence within the model, used for adjusting the input settings for Phaser-MR in Phenix. This should automatically account for redundancy: only the first matching sequence is considered, and sequences which are non-unique will have a copy number of -1. Thus if we have 4 copies of a sequence, and the PDB hierarchy has 2 matching chains, the copy numbers will be [0.5, -1, -1, -1].
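
The redundancy rule described above can be reproduced with a short standalone sketch. It is not the mmtbx implementation: the function name is hypothetical, sequences are compared by exact string equality rather than by alignment, and the number of matching chains per sequence is passed in directly.

```python
from collections import Counter


def relative_copy_numbers(sequences, n_matching_chains):
    """Hypothetical helper.

    sequences: list of sequence strings, possibly with repeats.
    n_matching_chains: dict mapping a sequence to the number of model
        chains that match it.
    """
    counts = Counter(sequences)
    seen = set()
    result = []
    for seq in sequences:
        if seq in seen:
            result.append(-1)  # repeated sequence: marked as non-unique
        else:
            seen.add(seq)
            result.append(n_matching_chains.get(seq, 0) / counts[seq])
    return result


# Four copies of one sequence, two matching chains in the model.
print(relative_copy_numbers(["MKV"] * 4, {"MKV": 2}))  # [0.5, -1, -1, -1]
```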

get_table_data()
sequence_as_cif_block(custom_residues=None)

Export sequence information as an mmCIF block (version 5.0 of the mmCIF/PDBx dictionary).

Parameters:

custom_residues (list of str) – List of custom 3-letter residues to keep in pdbx_one_letter_sequence. Each 3-letter residue must exist in the model. If None, the value from self.custom_residues is used.

Returns:

cif_block

Return type:

iotbx.cif.model.block

show(out=None)

mmtbx.validation.utils module

mmtbx.validation.utils.build_name_hash(pdb_hierarchy)
mmtbx.validation.utils.decode_atom_str(atom_id)
mmtbx.validation.utils.exercise()
mmtbx.validation.utils.export_ramachandran_distribution(n_dim_table, scale_factor=0.25)

Convert a MolProbity Ramachandran distribution to a format suitable for display using matplotlib (see wxtbx.plots).

mmtbx.validation.utils.export_rotamer_distribution(n_dim_table, scale_factor=0.5)

Convert a MolProbity rotamer distribution to a format suitable for display using matplotlib (see wxtbx.plots). Will reduce dimensionality to 2 if necessary.

mmtbx.validation.utils.find_sequence_mismatches(pdb_hierarchy, sequences, assume_same_order=True, expected_sequence_identity=0.8, log=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)
mmtbx.validation.utils.get_inverted_atoms(atoms, improper=False)
mmtbx.validation.utils.get_mmtype_from_resname(resname)
mmtbx.validation.utils.get_rna_backbone_dihedrals(processed_pdb_file, geometry=None, pdb_hierarchy=None)
mmtbx.validation.utils.get_rotarama_data(residue_type=None, pos_type=None, db='rama', convert_to_numpy_array=False)
mmtbx.validation.utils.get_segid_as_chainid(chain)
mmtbx.validation.utils.match_dihedral_to_name(atoms)
mmtbx.validation.utils.molprobity_score(clashscore, rota_out, rama_fav)

Calculate the overall MolProbity score, as described here:

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2877634/?tool=pubmed
http://kinemage.biochem.duke.edu/suppinfo/CASP8/methods.html
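
For readers who want to see the shape of the calculation, here is a standalone sketch of the MolProbity score as given in the publication linked above: a weighted sum of log terms for the clashscore, the percentage of rotamer outliers and the percentage of residues outside the favored Ramachandran regions. The function name molprobity_score_sketch is hypothetical, and the coefficients should be checked against the mmtbx source before relying on them.

```python
import math


def molprobity_score_sketch(clashscore, rota_out, rama_fav):
    """Standalone re-statement of the published formula (not the mmtbx call).

    clashscore: all-atom clashscore
    rota_out:   percentage of rotamer outliers
    rama_fav:   percentage of Ramachandran-favored residues
    """
    rama_iffy = 100.0 - rama_fav  # percentage not in favored regions
    return (0.426 * math.log(1 + clashscore)
            + 0.33 * math.log(1 + max(0, rota_out - 1))
            + 0.25 * math.log(1 + max(0, rama_iffy - 2))
            + 0.5)


# A model with no clashes, no rotamer outliers and all residues favored
# receives the constant offset only.
best = molprobity_score_sketch(0, 0, 100)
worse = molprobity_score_sketch(10, 2, 95)
```

Lower values indicate better geometry, so any increase in clashes or outliers raises the score above the 0.5 offset.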

mmtbx.validation.utils.use_segids_in_place_of_chainids(hierarchy, strict=False)

mmtbx.validation.waters module

class mmtbx.validation.waters.water(**kwds)

Bases: atom

Container for information about a water atom, including electron density properties.

altloc
anom
as_string(prefix='', highlight_if_heavy=False)
as_table_row_phenix()

Returns a list of formatted table cells for display by Phenix.

atom_selection
b_iso
property cc
chain_id
element
fmodel
fofc
static header()

Format for header in result listings.

icode
id_str()

Returns a formatted (probably fixed-width) string describing the molecular entity being validated, independent of the analysis type.

is_bad_water()
is_heavy_atom()
model_id
n_hbonds
name
nearest_atom
nearest_contact
occupancy
outlier
resname
resseq
score
segid
symop
two_fofc
xyz
class mmtbx.validation.waters.waters(pdb_hierarchy, xray_structure, fmodel, distance_cutoff=4.0, collect_all=True, molprobity_map_params=None)

Bases: validation

Assess the properties of solvent atoms, including local environment and electron density.

get_result_class()
n_bad
n_heavy
n_outliers
n_outliers_by_model
n_total
n_total_by_model
results
show(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='  ', verbose=True)
show_summary(out=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, prefix='  ')