My current research mainly
focuses on (1) developing supervised dimension reduction methods that help
explore and visualize high-dimensional data; (2) building directed
graphical models based on structural equations; and (3) defining R²
for models beyond (homoscedastic) linear regression models. Although I am
interested in addressing statistical issues in data science in general, most of
my current research is motivated by the analysis of data from
whole-genome/sequencing-based association studies,
whole-genome/sequencing-based animal/plant selection, eQTL
mapping, and gene-gene/gene-environment interaction studies.
Analysis, Empirical Likelihood Approach, Exploratory Data Analysis,
Graphical Models, Multivariate Extreme Values, Multivariate Statistics,
Supervised Dimension Reduction, Variable Selection for Large p Small n Data
M. Ren and D. Zhang (2018). Differential
Analysis of Directed Network. Proceedings of
the 34th Conference on Uncertainty in Artificial Intelligence
C. Chen, M. Ren, M. Zhang and D. Zhang (2017).
Two-stage penalized least squares method for constructing large systems
of structural equations. Accepted by Journal
of Machine Learning Research. arXiv:1511.00370.
R Package: BigSEM.
D. Zhang (2017). A coefficient of
determination for generalized linear models. The American Statistician, 71(4): 310-316. R Package: rsq. SAS Macro: RsquareV.
V. Pungpapong, M. Zhang and D. Zhang (2015).
Selecting massive variables using an iterative conditional modes/medians
algorithm. The Electronic Journal
of Statistics, 9, 1243-1266.
Y. Lin, M. Zhang and D. Zhang (2015).
Generalized orthogonal components regression for high dimensional
generalized linear models. Computational
Statistics & Data Analysis, 88, 119-127.
M. T. Wells and D. Zhang (2011). Graphical
models for clustered binary and continuous responses. In Advances in
Directional and Linear Statistics (edited by M.T. Wells and A. SenGupta),
305-321, Springer-Verlag Berlin Heidelberg.
N.-H. Chan, L. Peng and D. Zhang (2010).
Empirical-likelihood-based confidence intervals for conditional variance
in heteroscedastic regression models. Econometric Theory, 27:
M. Zhang, D. Zhang and M.T. Wells (2010). Generalized
thresholding estimators for high-dimensional location parameters. Statistica Sinica,
D. Zhang, Y. Lin and M. Zhang (2009). Penalized
orthogonal-components regression for large p small n data. Electronic
Journal of Statistics, 3:
781-796. R Package: POCRE
D. Zhang, M. T. Wells and L. Peng (2008). Nonparametric
estimation of the dependence function for a multivariate extreme value
distribution. Journal of Multivariate Analysis, 99: 577-588.
D. Zhang, M. T. Wells, B. W. Turnbull, D.
Sparrow and P. A. Cassano (2005). Hierarchical
Graphical Models: An Application to Pulmonary Function and Cholesterol
Levels in the Normative Aging Study. Journal of the American
Statistical Association, 100: 719-727.
D. Zhang, S. He and Z. Xie (1993). Outlier
Detection and Intervention for ARIMA(p,d,0). Proceedings
of First Asian Conference on Statistical Computation.
Genetics and Bioinformatics
of Gene Expression Data, Analysis of Mass Spectrometry Data, Comparative
Proteomics/Metabolomics Study, Quantitative Trait Loci Mapping, Whole-Genome/Sequencing-Based
L. Guan, Q. Wang, L. Wang, B. Wu, Y. Chen, F.
Liu, F. Ye, T. Zhang, K. Li, B. Yan, C. Lu, L. Su, G. Jin, H. Wang, H.
Tian, L. Wang, Z. Chen, Y. Wang, J. Chen, Y. Yuan, W. Cong, J. Zheng, J. Wang,
X. Xu, H. Liu, W. Xiao, C. Han, Y. Zhang, F. Jia, X. Qiao, Genetic REsearch on schizophreniA neTwork-China and Netherland (GREAT-CN), D. Zhang, M.
Zhang, H. Ma (2016). Common Variants on 17q25 and Gene-Gene Interactions
Conferring risk of Schizophrenia in Han Chinese Population and Regulating
Gene Expression in Human Brain. Molecular
Psychiatry, 2016, 1-7.
C. Chen, L. Deng, S. Wei, G. A. N. Gowda, H.
Gu, G. Chiorean, M. Zaid, M. Harrison, J.
Pekny, P. Loehrer, D. Zhang, M. Zhang, D.
Raftery (2015). Exploring Metabolic Profile Differences between
Colorectal Polyp Patients and Controls Using Seemingly Unrelated
Regression. Journal of Proteome
Research, 14: 2492-2499.
H. T. Zhang, D. Zhang, Z. G. Zha, C. D. Hu
(2014). Transcriptional activation of PRMT5 by NF-Y is required for cell
growth and negatively regulated by the PKC/c-Fos
signaling in prostate cancer cells. BBA
- Gene Regulatory Mechanisms, 1839, 1330-1340.
H. Li, Y. J. Wang, L. Hua, Y. T. Yang, M.
Zhang, D. Zhang, C. Y. Wang, and Z. Q. Xu (2013). Lack of association
between dendritic cell nuclear protein-1 gene and major depressive
disorder in the Han Chinese population. Progress in Neuro-Psychopharmacology & Biological Psychiatry,
V. Pungpapong, W. M. Muir, X. Li, D. Zhang,
and M. Zhang (2012). A fast and efficient approach for genomic selection
with high density markers. G3:
Genes, Genomes, Genetics, 2: 1179-1184.
V. Pungpapong, L. Wang, Y. Lin, D. Zhang, and
M. Zhang (2011). Genome-wide
association analysis of GAW17 data using an empirical Bayes variable
selection. BMC Proceedings, 5 (Suppl 9): S5.
L. Wang, V. Pungpapong, Y. Lin, M. Zhang, and
D. Zhang (2011). Genome-wide
case-control study in GAW17 using coalesced rare variants. BMC Proceedings, 5 (Suppl
X. Li, C. Zhu, Z. Lin, Y. Wu, D. Zhang, G.
Bai, W. Song, J. Ma, G.J. Muehlbauer, M.J. Scanlon, M. Zhang, and J. Yu
size in diploid eukaryotic species centers on the average length with a
conserved boundary. Molecular
Biology and Evolution, 28: 1901-1911.
D. Zhang (2010). Bayes and empirical Bayes
methods for spotted microarray data analysis. In Bayesian Modeling in Bioinformatics (edited by Dey, Ghosh, and Mallick).
Y. Lin, M. Zhang, L. Wang, V. Pungpapong, J.C.
Fleet, and D. Zhang (2009). Simultaneous
genome-wide association studies of anti-CCP in rheumatoid arthritis using
penalized orthogonal-components regression. BMC Proceedings, 3 (Suppl 7): S20.
M. Zhang, Y. Lin, L. Wang, V. Pungpapong, J.C.
Fleet, and D. Zhang (2009). Case-control
genome-wide association study of rheumatoid arthritis from GAW16 using
POCRE-LDA. BMC Proceedings, 3 (Suppl 7):
N. Liu, D. Zhang, and H. Zhao (2009). Genotyping
error detection in samples of unrelated individuals without replicate
genotyping. Human Heredity, 67: 154-162 (DOI:
D. Zhang, X. Huang, F.E. Regnier, and M. Zhang (2008). Two-dimensional
correlation optimized warping algorithm for aligning GCXGC-MS data. Analytical
Chemistry, 80 (8): 2664-2671.
M. Zhang, D. Zhang, and M. T. Wells (2008). Variable selection
with large p small n regression models: mapping QTL with epistasis. BMC
D. Zhang and M. Zhang (2007). Bayesian
profiling of molecular signatures to predict event times. Theoretical
Biology & Medical Modelling, 4:3, doi:10.1186/1742-4682-4-3.
D. Zhang, M. Zhang, and M. T. Wells (2006). Multiplicative
Background Correction for Spotted Microarrays to Improve Reproducibility.
Genetical Research, 87: 195-206.
M. Zhang, K. L. Montooth,
M. T. Wells, A. G. Clark and D. Zhang (2005). Mapping
Multiple Quantitative Trait Loci by Bayesian Classification. Genetics,
D. Zhang, M. T. Wells, C. D. Smart, and W. E.
Fry (2005). Bayesian
Normalization and Inference for Differential Gene Expression Data. Journal
of Computational Biology, 12: 391-406.
Complex Traits Consortium (2004). The
Collaborative Cross: A Community Resource for the Genetic Analysis of
Complex Traits. Nature Genetics, 36: 1133-1137.
of Diverse Biomedical Data
J. E. Huber, M. Darling, E. J. Francis, and D.
Zhang (2012). Impact of typical aging and Parkinson's disease on the
relationship among breath pausing, syntax, and punctuation. American Journal of Speech-Language
Pathology, 21: 368-379.
T.R. Mhyre, R. Loy, P.N. Tariot,
L.A. Profenno, K.A. Maguire-Zeiss, D. Zhang,
P.D. Coleman and H.J. Federoff (2008). Proteomic
analysis of peripheral leukocytes in Alzheimer's disease patients treated
with divalproex sodium. Neurobiology of Aging, 29: 1631-1643.
S. W. Perry, J. P. Norman, A. Litzburg, D. Zhang, S. Dewhurst and H. A. Gelbard (2005). HIV-1 Transactivator of Transcription Protein Induces
Mitochondrial Hyperpolarization and Synaptic Stress Leading to Apoptosis.
Journal of Immunology, 174: 4333-4344.
M. Zhang, X. Wang, D. Zhang, G. Xu, H. Dong,
Y. Yu and J. Han (2004). Orphanin FQ Antagonizes the Inhibition of Ca2+
Currents Induced by Mu-opioid Receptors. Journal of Molecular
Neuroscience, 25: 21-27.
are developed in MATLAB. All copyrights are retained by Dabao Zhang unless stated otherwise.
They are free to use for academic purposes with proper citation. Please contact me for any bugs and
the generalized orthogonal-component regression (GOCRE) algorithm
proposed in Lin, Zhang and Zhang (2015).
POCRE: Implement the penalized
orthogonal-component regression (POCRE) algorithm proposed in Zhang,
Lin and Zhang (2009), and
add new functions for variable screening, tuning parameter selection, and
Implement the two-dimensional correlation optimized warping algorithm
proposed in Zhang,
Huang, Regnier and Zhang (2008).
MicroBayes: Implement the approach proposed in Zhang,
Wells, Smart, and Fry (2005).
Implement the generalized empirical Bayes thresholding with Cauchy priors
proposed in Zhang, Zhang and Wells (2010).
Implement the generalized empirical Bayes thresholding with Laplace
priors, which is developed in a paper in preparation (see Zhang,
Zhang and Wells, 2010 for GEBT).
Implement the Bayesian approach for QTL mapping proposed in Zhang,
Montooth, Wells, Clark and Zhang (2005),
which is extended in Zhang, Zhang, and
Wells (2008) and another paper in preparation.
Implement the EM algorithm for mixed graphical models as described in Zhang,
Wells, Turnbull, Sparrow and Cassano (2005).