My recent and main projects

Business

I investigated the link between Open Innovation (OI) and Circular Economy (CE), as well as their impact on Environmental Performance (EP) and Financial Performance (FP) in Brazilian enterprises. My specific aims were to determine whether OI affects CE and, indirectly, EP and FP, as well as to study the direct consequences of the link between OI and CE for EP and FP.

After discussions and revisions with the sponsor, I determined that the situation called for a Mediation Analysis. I therefore estimated and tested a structural model to reach the end goal of the analyses: estimating the following paths and testing their significance: OI->CE; OI->CE->EP; OI->CE->FP; CE->EP; and CE->FP; as well as the overall influence on EP and FP.
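As a brief sketch of what the mediated paths mean in a linear path model (the β notation is an assumption introduced here, not the project's own), each indirect effect is the product of the coefficients along its path:

```latex
\begin{aligned}
\text{indirect}(OI \to CE \to EP) &= \beta_{OI \to CE}\,\beta_{CE \to EP} \\
\text{indirect}(OI \to CE \to FP) &= \beta_{OI \to CE}\,\beta_{CE \to FP}
\end{aligned}
```

In PLS-SEM the significance of such products is usually assessed by bootstrapping rather than by a closed-form standard error.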

The available sample included 173 responses, and each of the four constructs was explored, justified, and cited in the researcher’s dissertation project. The two control variables, ISO and SIZE, corresponded to two items on the questionnaire.

The circumstances described in the study prompted the adoption of variance-based structural equation modeling (SEM). I therefore used the Partial Least Squares SEM (PLS-SEM) algorithm, the most often used estimation technique in this context, and followed the steps recommended in the literature, using the SmartPLS 3.3 program.

With this project I intended to investigate the influence of dynamic capabilities on digital disruptive innovation efforts, considering the mediating effect of knowledge intensity, in companies embedded in innovation ecosystems and knowledge-intensive entrepreneurship. To achieve these objectives, a quantitative, exploratory approach was adopted: a questionnaire was applied to a sample of 273 individuals working in companies embedded in innovation environments, and the independent, mediating and dependent variables of the study were collected. The variables and constructs adopted in the research are scales previously validated in other studies: i) the scales for the independent variable (dynamic capacities), the dependent variable (business performance in response to digital disruption) and the mediator (digital platform capacity) are those of Karimi and Walter (2015); and ii) for the mediating variable (intensity of knowledge), the measurement model adopted was that of Autio et al. (2000).

I fitted a complex structural model in which the variable “Dynamic capacity” was measured by a third-order measurement model (reflective-reflective and reflective-formative), the variable “Digital platform capacity” by a second-order measurement model of the reflective-reflective type, the variable “Intensity of knowledge” by a reflective measurement model, and the variable “Process performance” by a formative measurement model. Given the complexity of the model, the modest sample size (n = 273) and the presence of formative constructs, the feasible methodological path was the Partial Least Squares Structural Equation Modeling (PLS-SEM) technique.

I participated in this project from the beginning: the conception of the research problem, the proposal of the questionnaire, data collection and the analyses. We collected information on financial literacy (knowledge, attitude and behavior), the Big Five personality traits and Emotional Intelligence, each with a measurement model to be tested within a structural model. For each measurement model only a reliability evaluation was carried out, because these are scales with good psychometric properties, and afterwards a Multiple Linear Regression (MLR) analysis was fitted, at the request of the sponsor.

In this project, I investigated the relationship between institutional void and health problems, mediated by a lack of basic sanitation. For the analyses I had data for 2010-2019, consisting of several sociodemographic, health and sanitation variables aggregated at the level of Brazilian municipalities. Together with the sponsor, I selected the main variables (guided by theory), because many served the same purpose, to build three formative measurement models: institutional void, deficit in basic sanitation and health problems.

Several methods were tried, grouped into two main approaches. 1) A two-stage process: i) I fitted the measurement models (PCA) by construct and per year, estimating 27 components in this case (3 x 9 years); and ii) I used them as the variables X, M and Y in a longitudinal mediation model: a) Cross-Lagged Panel Model (autoregressive model); b) Latent Growth Mediation Model; c) Latent Change Model; d) random slopes; e) latent interactions; and f) Path Analysis. Models b), c) and e) refer to reflective constructs. Model d) falls within the Multilevel Modeling Approaches for Longitudinal Data and includes model f). In this case, I tested the mediation with longitudinal data through the classical tests (Path Analysis, model f) and in a more elegant way: Path Analysis with random intercepts and/or slopes. 2) A complete process: I fitted the measurement models and the structural model (mediation analysis) simultaneously, in a single estimation with the PLS-SEM algorithm, considering longitudinal data: g) Evolution model; and h) Change model. Model a) can also be considered within this context. The PLS-SEM algorithm was used because of the formative constructs and the complexity of the resulting models. Although we tried both approaches, we were unsuccessful with the second, and we presented the final solution from a Path Analysis with random intercepts, with the measurement models coming from the 27 PCAs (model f).

The objective of this project was to verify the impacts of the first year of the COVID-19 pandemic on Brazilian workers over 60 years of age (TB60+). Specifically, the sponsor wanted to describe and characterize TB60+, investigate the relationship between active aging factors and the intention to continue working, list professional opportunities and challenges experienced by TB60+ and propose a TB60+ classification model.

Thus, the following was proposed: 1) a descriptive analysis of the variables through tables and graphs; 2) relating the variable that identifies the groups (1 = main and 2 = control) to the variables of part 1 of the questionnaire through the Chi-square Test (when the other variable was nominal with more than two categories or ordinal) or Fisher’s Exact Test (when the other variable had two categories), with a Bonferroni adjustment in the comparison of proportions for multiple-response variables (0 = No; 1 = Yes); and 3) tests of proportions through the chi-square statistic for the other variables of the questionnaire.
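The Bonferroni adjustment mentioned above can be sketched in a few lines. This is an illustration only: the p-values are made up and the function name is hypothetical, not taken from the project.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Made-up raw p-values from four comparisons of proportions.
raw_p = [0.004, 0.020, 0.300, 0.650]
adjusted = bonferroni(raw_p)

# A comparison stays significant only if its adjusted p-value is below 0.05.
significant = [p < 0.05 for p in adjusted]
```

Comparing the adjusted p-values with 0.05 is equivalent to comparing the raw ones with 0.05 divided by the number of comparisons.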

This project concerned support in defining the statistical methodology employed in the research “CORPORATE FINANCIAL PERFORMANCE AND ITS RELATION TO CORPORATE SOCIAL PERFORMANCE, CORPORATE REPUTATION, ADVERTISING, ADVERTISING AND COMMUNICATION DURING THE PANDEMIC OF COVID-19”. The sponsor sent me the research project, which contextualized the problem, and in light of the type of data and problem presented, I proposed the statistical methodology to be used in carrying out the research.

Accordingly, in view of the research problem, the operationalization of the research variables and the structure of the available data, we proposed to study the hypotheses of the project through panel data models with random effects.

The project dealt with a formula for calculating the Purchasing Managers’ Index (PMI) for soybean meal, corn and feed, based on surveys conducted with purchasing managers of companies in the states of Minas Gerais and Goiás.

Drawing on other PMIs that exist in the market, such as the Industrial PMI, the Services PMI and the Brazil PMI (Composite) calculated by Markit Economics, my proposal was to transform the frequencies of the answers to the questions (Lower, Equal and Higher) into a standardized score with a mean of 50 and a standard deviation of 10. As the answers have a natural ordering (Higher > Equal > Lower), an a priori question was the definition of the weights (w) of each class j.
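A minimal sketch of this kind of transformation follows. It assumes the usual diffusion-index weights (1, 0.5 and 0) for the three answer classes; the weights actually chosen in the project are not stated, and the counts and function names below are made up for illustration.

```python
from statistics import mean, pstdev

# Assumed weights w_j per answer class (hypothetical; the text leaves them open).
WEIGHTS = {"higher": 1.0, "equal": 0.5, "lower": 0.0}


def diffusion_index(counts):
    """Weighted share of answers on a 0-100 scale."""
    total = sum(counts.values())
    return 100.0 * sum(WEIGHTS[k] * n for k, n in counts.items()) / total


def standardize(values, target_mean=50.0, target_sd=10.0):
    """Rescale a series to the target mean and standard deviation."""
    m, s = mean(values), pstdev(values)
    return [target_mean + target_sd * (v - m) / s for v in values]


# Made-up monthly answer counts for one question.
months = [
    {"higher": 30, "equal": 50, "lower": 20},
    {"higher": 20, "equal": 40, "lower": 40},
    {"higher": 45, "equal": 35, "lower": 20},
]
raw = [diffusion_index(c) for c in months]
scores = standardize(raw)
```

After rescaling, a month scoring above 50 sits above the average of the series, and each 10 points corresponds to one standard deviation.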

In this project I sought to answer the following questions: Does the Shop-in-Shop (SIS) strategy in popular sporting goods stores positively affect purchase intention? What is the influence of this strategy on the perception of brand image and of the quality of sporting goods? I had a sample of sporting goods consumers who were surveyed while they were buying or browsing items in eleven different stores and/or were visually exposed to the SIS space.

Thus, I first ran an Exploratory Factor Analysis (EFA) of the items and then validated the measurement model with a Confirmatory Factor Analysis (CFA). With the factors obtained from the CFA and the profile variables, I first performed nonparametric tests (Mann-Whitney, Kruskal-Wallis and Spearman correlation) and then fitted a structural model with Structural Equation Modeling (SEM) in order to test the research hypotheses. I fitted the SEM model in two ways: i) one taking the factor scores obtained from the CFA as observed variables; and ii) another estimating the hypothesized causal effects jointly with the measurement model.

In this project I participated in all phases: conception of the idea, proposal of the measurement instrument, data collection and analysis. We aimed to verify the influence of personality traits (extroversion, socialization, conscientiousness, neuroticism and openness) on project success. For this, we collected a sample of 205 individuals who answered an online questionnaire divided into three sections: 1) an instrument based on Hauck Filho et al. (2012) to evaluate the five major factors (Big Five); 2) another instrument to evaluate project success, based on Mendes and Filho (2014); and 3) a section with sociodemographic questions (age, gender, education, experience, PMI certification, function and size of the company).

In methodological terms, it was assumed that the Big Five instrument is robust and has good psychometric properties, given its extensive validation work since the 1990s and its application in various contexts. Thus, only the internal consistency (Cronbach’s Alpha) of each of the five factors in the present sample was evaluated. In the case of the Project Success instrument, the wording of the instrument had been changed and two items had been added. I therefore opted to go further in validating the measurement model of project success through an Exploratory Factor Analysis (EFA) and a Confirmatory Factor Analysis (CFA). After evaluating the two measurement models, I first ran a Correlation Analysis between the resulting scores and then fitted a conceptual model through Structural Equation Modeling (SEM).
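For reference, Cronbach’s Alpha for one factor can be computed from the item variances and the variance of the summed score. The sketch below uses made-up answers and a hypothetical function name; it is not the code used in the project.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of respondent scores per item (all the same length)."""
    k = len(items)
    item_var_sum = sum(variance(col) for col in items)
    # Total score of each respondent across the k items.
    totals = [sum(row) for row in zip(*items)]
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Made-up answers of five respondents to the three items of one factor.
factor_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(factor_items)
```

Values around 0.70 or higher are commonly read as acceptable internal consistency.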

This project was quite interesting, as I had an unusual data structure and problem at hand. The objective was to examine the relationship between Brand Personality (PE), Brand Equity (EQ) and Brand Experience using about 800 observations from about 400 individuals, who answered a questionnaire with scales designed to measure these constructs for two brands: Colgate (1) and Coca-Cola (2). The three scales used as measurement models had already been used in other studies in Brazil.

In methodological terms, the examination of the hypothesized relationships was carried out within the framework of Structural Equation Modeling (SEM), because these were latent constructs measured by manifest variables (items). To operationalize the examination of the relationships I had to fit a second-order model. Another complication was that there were two answers from the same individual. As a strategy, I fitted a model with all observations, splitting the data into two groups (Colgate and Coca-Cola) while constraining the estimated parameters for Colgate and Coca-Cola to be equal.

This project was relatively simple, owing to the configuration of the questionnaire, with many open questions, missing values and an extremely small sample. The general objective was to identify the strengths and weaknesses of the implementation of patient safety centers (NSP) in hospital institutions.

The only hypothesis tests performed, at the request of the sponsor, related a single variable from the database (number of beds) to three others of interest. I used the nonparametric Mann-Whitney test.

The objective of this project was to validate the Authentic Leadership Questionnaire (ALQ) in the Rater and Self versions, i.e., the same individual answered the questionnaire twice: 1) in view of the leadership profile of his immediate boss (Rater); and 2) in view of his own leadership profile (Self). In this case, it is an independent validation procedure for two scales.

First, I checked the sample profile through a Descriptive Analysis, and then I validated the two scales with a Confirmatory Factor Analysis (CFA) using a more exploratory approach, as proposed by Credé and Harms (2015). This same approach was used by Avolio, Wernsing and Gardner (2018), the authors who proposed the ALQ, when they reevaluated the original scale after 10 years.

In this project I evaluated productivity (daily milk production in liters) and working hours (weekly working time in hours) in rural dairy farms in Campo Mourão and Araruna.

To relate productivity and working hours to the other scale variables in the database, I used Spearman’s Correlation Coefficient. To relate productivity and working hours to the nominal variables of interest with two categories, I used the Mann-Whitney Test (MW). For the nominal variables of interest with more than two categories, I used the nonparametric Kruskal-Wallis Test (KW).
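As a rough illustration of the rank-based statistics behind these tests, the sketch below computes Spearman’s rho and the Mann-Whitney U statistic from average ranks. The data and function names are made up, and only the statistics are computed, not their p-values.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def mann_whitney_u(group_a, group_b):
    """U statistic of group_a: its rank sum minus the smallest possible rank sum."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    return sum(ranks[:n_a]) - n_a * (n_a + 1) / 2


# Made-up farms: daily milk production (liters) and weekly working hours.
liters = [120, 95, 150, 80, 110]
hours = [50, 44, 60, 40, 48]
rho = spearman_rho(liters, hours)

# Made-up production split by a two-category nominal variable.
u_stat = mann_whitney_u([80, 95, 110], [120, 150])
```

The Kruskal-Wallis test extends the same ranking idea to three or more groups.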

This project was very interesting in terms of the analyses applied. There were two databases: i) one with more than 300,000 customers from a retail store in São Paulo and ii) another with about 1,600 of these 300,000 customers, who answered a questionnaire with four psychological scales. The objective was to understand the socio-demographic and psychological profile of those customers who donated their change from purchases.

Thus, initially, a sampling process was carried out on the first database in order to assess the effect size of the situational factors: donation according to the value of the purchase, change, form of payment, type of beneficiary, number of donations, cashier, store, etc. In a second stage, I merged the two databases and fitted several models: binary and ordinal logit, Poisson and hierarchical Tobit.

This project was a very interesting challenge. It dealt with the first steps towards building a scale (measurement instrument) on a very delicate subject, with topics related to Misappropriation, Harassment, Corruption and Fraudulent Demonstrations. The questionnaire consisted of 126 simple-choice, discursive and video questions. For the last two types of questions, a specialist qualitatively analyzed each answer and classified it as High, Medium or Low Resilience. For the simple-choice questions, in which the respondents expressed their feeling (Angry, Thoughtful, Frustrated, Sad, etc.) about the statement from several answer options, the answers were also classified as High, Medium or Low Resilience. To further increase the challenge, the sample was very small, with several missing values. In the end, due to the limitations of the sample, I applied several PCAs to try to reduce the number of items and evaluate the most viable ones to carry into later studies.

The objective of this project was to evaluate the relationship between business process orientation (BPO) and socio-environmental management practices (PGS) in micro and small companies in the state of Rio de Janeiro (RJ). For this we had a research questionnaire that was applied in micro and small companies in RJ. The BPO measurement instrument was taken, with due adjustments, from the literature.

Thus, I assumed that the two instruments applied (BPO and PGS) followed the methodological, scientific and statistical rigor necessary to make them valid and reliable, and thus eligible for the present purpose. I therefore restricted myself to calculating the reliability of the factors in the sample through Cronbach’s Alpha coefficient and proceeded with the analyses for those factors with alpha around 0.70 or higher. After analyzing the reliability of the instruments, I performed a Descriptive Analysis of the profile variables and questionnaire items and then moved on to Bivariate Analyses: Spearman correlation and Kendall’s partial correlation coefficient.

In this project I analyzed the degree/level of eco-innovation practices related to product, process and organizational aspects in the textile industries of Southern Brazil. For this, there were data from a descriptive and exploratory study, in which a survey collected questionnaires from respondents working in these industries. The questionnaire contained a respondent profile section, a company profile section and 18 Likert questions (5 points) in which individuals indicated their agreement/disagreement about the practices of eco-innovation, sustainability and environmental responsibility existing in their company with regard to: i) organizational aspects; ii) production processes; and iii) manufactured products.

I ran a Descriptive and Bivariate Analysis (Friedman Test, Kruskal-Wallis Test and Spearman Correlation) of the data; beforehand, however, I applied a Principal Component Analysis to reduce the 18 Likert questions. The bivariate tests were followed by multiple (post hoc) comparisons with Bonferroni adjustment.

The objective of this project was to identify the variables of leadership communication in decision making and how these variables interfere in decision making for different types of organizations in Brazil. These objectives were pursued in the context of two proposed conceptual models, which were evaluated after I validated four measurement models: Leadership, Communication, Decision Making and Organizational Effectiveness.

As this was an exploratory study, the strategy adopted was to apply an Exploratory Factor Analysis (EFA) separately to each of the research constructs, and then refine the fit of the four measurement models through a Confirmatory Factor Analysis (CFA). With the four measurement models refined through the CFA, I performed a Multigroup Analysis (Invariance Analysis), one for each measurement model, with the company sector as the grouping variable. The two expected structural models were fitted, refined, and later compared.

It was a simple project, in which data collection was performed through a questionnaire based on the principles of the Educating City and democratic city management, addressed in Chapter IV of the City Statute. The questionnaire had three blocks, with dichotomous questions, scale questions and one open question. The Likert scale had five points, ranging from “Totally Agree” to “Totally Disagree”.

For data analysis, we used Descriptive Statistics (absolute and relative frequency) and  the Chi-square Test to support the interpretation of the results.

This project is an example of what NOT to do. It aimed to evaluate the relationship between the entrepreneurial characteristics of individual microentrepreneurs of Catalão (GO) and business performance; however, to capture the entrepreneurial characteristics, a research instrument originally applied to farmers in Rio Grande do Sul was used, adapted to the present context. We tried to save the research, which had already been carried out, by putting the instrument to the test only in the context of an Exploratory Factor Analysis (EFA). The performance variables were calculated from the average billing (2010-2015) and average billing / number of employees, collected from the service database of SEBRAE/GO and the websites of the Entrepreneur Portal, the Electoral Court of Goiás and the Federal Revenue.

With the structured database, I performed: i) descriptive analyses to characterize the sample profile; ii) non-parametric tests (Mann-Whitney and Kruskal-Wallis) to relate the profile variables to the entrepreneurial characteristics (EFA factors) and the performance variables (average billing and average billing / number of employees); and iii) to fulfill the main objective of the research, relating entrepreneurial characteristics to business performance, the Spearman correlation.

The activities carried out in this project aimed to understand, from longitudinal data, how the characteristics of the development of the environment affect the way people move around. Retrospective data covering 20 years, from 1992 to 2011, were collected in the city of Rotterdam, Netherlands, from a questionnaire containing variables on: i) demographic characteristics; ii) household; iii) mobility; iv) occupation; and v) urban form. The idea was to evaluate the relationship between the mobility variables and the urban form variables, mediated by the other characteristics of individuals.

After much research, the only feasible alternative for the problem posed lay in the recent methodology of Generalized Structural Equation Modeling (GSEM). Only this technique encompasses the evaluation of mediation effects together with endogenous/exogenous variables of ordinal/multinomial measurement, that is, variables whose probability distributions are not restricted to the normal distribution. Additionally, the data also had repeated measurements and potential random effects.

This project aimed to fit a structural model of Brand Value for private higher education institutions (HEI) from the consumer’s perspective. To reach this end, along the way, I had to validate seven measurement models: University Experience, Controlled Brand Communication, e-WOM, Co-creation, Brand Awareness, Brand Association and Brand Value of the HEI.

With about 1,200 responses to a questionnaire, I used the following strategies: i) I applied an Exploratory Factor Analysis (EFA) to each of the constructs to explore the latent factors; ii) I then applied a Confirmatory Factor Analysis (CFA) to refine the measurement models; and iii) with the factors estimated in the CFA, taking the conceptual model as a starting point, I sought to fit a structural model with Structural Equation Modeling (SEM) techniques.

For the sake of parsimony, I chose to fit a Path Analysis, estimating the factors and then using them as observed variables, rather than a complete model in which each factor would enter as a latent variable, with the structural relationships between them theorized according to the conceptual model.

I like to cite this project among my examples of what NOT TO DO. The sponsor made up several questions without following any methodological criteria, especially those that come from a process of creating measurement instruments. The outcome was predictable: it did not turn out well! In it I tried to identify the determining variables for the success of sharing-oriented businesses from the perspective of the Brazilian consumer, and whether there are differences in perception by gender, age and income. We used a questionnaire with 30 questions whose answer options followed no standardization or specific rule; they were written, in a somewhat arbitrary way, to capture the direction (favorable or unfavorable) with respect to what each item was meant to measure.

I tried to save the project with an Exploratory Factor Analysis (EFA) to obtain estimates of the factors. With the factors obtained from the EFA, I evaluated their reliability through Cronbach’s Alpha (α). Then, where reliability was acceptable, I examined how the factors are related through the Friedman Test, since, by definition, even if measured at a scale level, factors/constructs are natural orderings from a single related sample. Next, I tested whether the estimated factors differ by gender, age and income through Mann-Whitney and Kruskal-Wallis tests, because these are independent samples and the variables/factors are natural orderings by definition.

The analyses performed in this project sought to show the contribution of a program to treatment adherence for hypertension, diabetes and dyslipidemia, with regard to the use and acquisition of medications. The program provided technical guidance on the use of the drugs in the prescription, analyzed the interactions between the drugs prescribed by different professionals and finally replenished the drug at the scheduled time, offering the possibility of greater treatment adherence. A sample of individuals who were part of the program was compared with another sample of individuals without pharmaceutical follow-up to evaluate the differences between the periods of acquisition of medicines for hypertension, diabetes and dyslipidemia.

To solve the problem, I used several non-parametric tests: the Mann-Whitney Test (U), the Tarone and Breslow-Day tests of homogeneity, the Cochran and Mantel-Haenszel tests of conditional independence, the Spearman correlation and the Chi-square test.

In this project, I checked whether the degree of importance of environmental practices influences the perception of the quality of services. For this, a questionnaire was applied with five questions measuring the degree of importance of some environmental practices, together with the SERVQUAL scale for libraries to measure the quality of services. As my reading of the project did not find a model specifying the direction of the effects, I tried to solve the problem through bivariate correlations, without the need to infer a causal relationship between the variables.

However, before proceeding to the Spearman correlations between the variables of interest, I computed the reliability of the SERVQUAL scale. In the case of the five variables on the degree of importance of environmental practices, I evaluated these items through an Exploratory Factor Analysis (EFA) to see whether they form a common factor. Thus, with the reliable dimensions of SERVQUAL and the environmental practice factor extracted from the EFA, I related these variables to the profile variables through the Kruskal-Wallis Test and Spearman’s Correlation, and then related the environmental practices factor to the dimensions of SERVQUAL through Spearman’s Correlation.

This project was simple: it was only a descriptive analysis of aggregated data. Law 10.168/2000 instituted the Contribution for Intervention in the Economic Domain (CIDE), aimed at financing the Program for Stimulating University-Company Interaction to Support Innovation. The legal instrument set some percentages for resource allocation, and I evaluated the historical data to see whether this provision was being met.

In this project I evaluated the association between the constructs of market orientation, learning orientation and perceived business performance in a sample with few respondents. It rested on a theoretical model drawn from the literature, pointing clearly to a structural equation modeling (SEM) problem. However, as the sample size was very small for this type of technique and the measurement instruments were very poor, I used a simpler approach, nonparametric correlations (Spearman correlation and Kendall’s partial correlation), because the variables have an ordinal measurement level.

In this project I prepared a technical report with the objective of elucidating statistical facts related to sample calculations and assisting the decision of the Court, clarifying in a technical, clear and objective way the questions arising from the spreadsheet of disallowed charges (glosas) audited in a private hospital in the state of Minas Gerais.

In this project I identified, estimated and profiled the payroll costs arising from sick leave due to accidents and illnesses among workers in the poultry slaughter, processing and storage industry. We analyzed 3,526 cases of leave that actually occurred in a poultry slaughter, processing and storage company from January 2005 to December 2007, duly documented by medical leave reports. These cases, whose information is not complete and which may refer to the same worker with more than one leave, were cross-referenced with the company’s payroll, which made it possible to estimate the costs, including wage payments, receipts and compensation to the worker, payroll taxes and contributions, etc. The profile of these costs was outlined by gender, function, sector and type of disease leading to the leave. For this, an exhaustive effort was first made to clean and structure the database that was made available to me.

In essence, after describing the research variables (descriptive statistics), we used nonparametric tests to make inferences about possible profile differences in estimated costs, days and salaries. To study the relationship between gender and estimated costs, days and salaries, the Mann-Whitney Test was used, and for the other profile variables (function, sector, disease and meatpacking plant) the Kruskal-Wallis Test with the Dwass-Steel-Critchlow-Fligner (DSCF) multiple comparison procedure.

In this project I evaluated the problem and described the quantitative research procedures that were used in the research “Internationalization of R&D: analysis of the degree of technological complexity attributed to multinational subsidiaries”. A text was proposed for the sponsor to include in the research project, in the section dealing with the research methods, and it is therefore a succinct description. The subject was the application of Structural Equation Modeling (SEM) to the research data.

This project dealt with a proposal of statistical methodology that was used in the research “The relationships between environmental and organizational variables in the performance of startups”.

Initially, the research problem pointed to the use of Structural Equation Modeling (SEM). However, in case a sample and questionnaire suitable for SEM were not obtained, I left room to continue with the quantitative approach and make use of i) a descriptive analysis; ii) bivariate analyses, such as correlation analysis and hypothesis tests (parametric or non-parametric); and iii) even some modeling, such as Exploratory Factor Analysis (EFA).

In this project I was a consultant investigating the perceptions of top and middle management about financial sustainability in the APAEs of Goiás and Espírito Santo, through the analysis of their management practices and the resources they use for their survival, taking into account the fulfillment of their respective institutional missions. It is a survey with a qualitative and quantitative approach. For data collection, a semi-structured questionnaire containing open and closed questions was applied. The data were analyzed using descriptive statistics for the objective items of the questionnaire and Content Analysis for the open questions.

Economics

In this report, I evaluated the relationship between Frivolous Actions of a Consumer Nature and Judicial Gratuity between 2016 and 2022 in the civil courts of the Court of Justice of the State of Paraná. To that end, we had a sample of 2,649 lawsuits from 2016 to 2022, classified as frivolous (yes/no) and gratuitous (yes/no) and separated by the outcome of the lawsuit: settlement, extinguished, baseless, well based, and without judgment. In summary, the goal was to find a relationship between the variables FRIVOLA and GRATUIDADE, controlling for the years and the outcomes. The first control was chosen because the contextual environment was suspected of influencing the relationship, and the second, while not meeting the causal requirement of precedence, was viewed as an estimate of the probabilities (chances of a positive outcome) made by economic agents when filing a lawsuit.

As groundwork for proposing a binary logit model (logistic regression), I conducted a descriptive analysis (bar graphs and frequency tables) and a bivariate analysis (χ2 test of independence, odds ratio (OR), and Cramér's V) of the relevant associations. The analyses were carried out using jamovi 2.3.21.
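
As a minimal illustration of the three bivariate measures, the sketch below computes them in Python for a 2×2 table. It is not the jamovi workflow used in the report, the function name is hypothetical, and the counts are invented for the example.

```python
import math

def two_by_two_association(a, b, c, d):
    """Association measures for a 2x2 table [[a, b], [c, d]].

    Rows are the categories of one yes/no variable, columns of the other.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))

    # Pearson chi-square: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    cramers_v = math.sqrt(chi2 / n)  # min(rows, cols) - 1 equals 1 for a 2x2 table
    return chi2, p_value, odds_ratio, cramers_v

# Hypothetical counts, not the lawsuits analyzed in the report.
chi2, p, odds, v = two_by_two_association(30, 10, 20, 40)
print(round(chi2, 3), round(odds, 2), round(v, 3))
```

No continuity correction is applied here, so the χ2 value may differ from software that applies one by default.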

This project aimed to relate the perceptions and expectations of basic education teachers to school performance at the aggregate level of Brazilian citizens, following the spatial econometrics approach. To this end, I used secondary data extracted from the National System of Evaluation of Basic Education (SAEB) for 2013 and 2017, together with the results of 9th grade elementary school students in the standardized mathematics and Portuguese language tests in those same years. Within the spatial econometrics approach, I used Exploratory Spatial Data Analysis (ESDA) and spatial regressions (SAR, SEM and SAC), closely following the books of Almeida (2012) and Golgher (2015), with the GeoDa and GeoDaSpace software. Before the spatial modeling, however, it was necessary to build the variables for teachers' perceptions and expectations from the questions of the SAEB teacher questionnaire on “Learning Problems”. In items Q70 to Q82, the teachers were asked for their perception of possible causes of students' learning problems, and from these items I used a Principal Component Analysis (PCA) to extract three components, in line with previous evidence in the literature.
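
ESDA usually starts from a measure of global spatial autocorrelation. The sketch below computes global Moran's I in plain Python; that this particular statistic was part of the project's ESDA is an assumption, and the weights matrix and values are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a spatial weights matrix
    given as a list of lists (weights[i][j] > 0 when i and j are neighbours)."""
    n = len(values)
    mean = sum(values) / n
    dev = [x - mean for x in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)

# Hypothetical example: four regions on a line, adjacent regions are neighbours.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
clustered = morans_i([1.0, 2.0, 3.0, 4.0], w)    # similar values sit next to each other
alternating = morans_i([1.0, 4.0, 1.0, 4.0], w)  # neighbours take opposite values
print(clustered, alternating)
```

Positive values indicate that similar values cluster in space, negative values that neighbours tend to differ.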

With this project I intended to investigate the relationship between economic growth and financial credit in Brazilian sub-regions between 2005 and 2016. For this, I had information from Brazilian municipalities aggregated into 84 sub-regions between 2005 and 2016, i.e., a balanced panel with 1008 observations (12 years × 84 sub-regions), of which one or two years of information could be lost depending on the modeling used.

I sought to fit four models, one to test each hypothesis, under two different approaches: 1) a static panel, in which the variables enter the model at the same time (same period t); and 2) a dynamic panel, in which the dependent variable enters the model with a lag (in t-1). Three blocks of control variables and time dummies (time effects) were added to the models. For each decision taken, or hypothesis made about the process generating the data, the recommended tests or diagnostics were performed. To evaluate whether dynamic panel models were more appropriate than static panel models, for example, I checked the significance of the persistence coefficient.

In the context of static panel models, the general procedures were performed: i) the modified Wald test for heteroscedasticity; ii) the bias-corrected LR test for serial autocorrelation of order p in panel data; iii) the Breusch-Pagan test to choose between random effects and pooled models; iv) the Chow test to choose between fixed effects and pooled models; and v) the Hausman test to choose between random effects and fixed effects. Additionally, I used fit measures (R2 and Adj. R2) and information criteria (RMSE, log-likelihood, AIC and BIC) to evaluate and choose models, and also considered the overall tests of the models (F and Wald) and the statistical significance of the coefficients, which in this study were elasticities, because all variables were in logarithms.

Historically, in economics and finance, dynamic panel problems have been handled with models estimated by Difference or System GMM (Generalized Method of Moments), in one or two steps. Between the Difference estimator and Sys-GMM, I chose Sys-GMM on the recommendation of the literature, which highlights its better efficiency i) in small samples and in the presence of heteroscedasticity and autocorrelation, and ii) under the rule given by Bond (2002): use Sys-GMM if the coefficient on the lagged dependent variable is close to 1. I used the two-step Sys-GMM procedure, due to the presence of heteroscedasticity and serial correlation, with the small-sample correction proposed by Windmeijer.

Finally, to tackle the problem of choosing instruments, I used the proposal of Kapetanios & Marcellino (2010), in order to avoid the proliferation of instruments (more instruments than sub-regions). To evaluate the models estimated by two-step Sys-GMM I used: i) Hansen's J test for the validity of the instruments and exogeneity of the regressors; ii) the Arellano-Bond AR(2) test for evidence of second-order autocorrelation; iii) the significance test of the coefficient on the lagged dependent variable, to justify the use of dynamic panel techniques; iv) a check that the number of instruments was smaller than the number of sub-regions; v) the overall significance test (F test) of the model; and vi) the fit measures of the principal component model estimated to reduce the dimension of the instruments: % explained variance (>70%) and KMO (>0.70). In i) and ii) I followed the conservative guidelines proposed by Kiviet (2020): test statistics with p-value > 0.20. I used Stata v.15 for the analysis, mainly the xtreg (static panel) and xtabond2 (dynamic panel) commands.

This project aimed to investigate the relationship between economic growth and financial credit in the sub-regions of Rio Grande do Sul between 2006 and 2019. For this, I had information from 497 municipalities in Rio Grande do Sul, grouped in nine sub-regions, between 2005 and 2019. The dependent variable was the variation in the GDP of these nine sub-regions, and the independent variables of interest were related to each of the hypotheses: variation of the Total Credit Balance; variation of the Legal Entities Credit Balance; variation of the Personal Credit Balance; and variation of the Legal Entities Credit Balance and the Directed Credit Balance of the Natural Person (Rural and Real Estate). Thus, the intention was to fit four models, one to test each hypothesis, with four control variables common to all of them: 1) Delinquency; 2) Long-Term Credit; 3) High Risk Credit; and 4) Number of Credit Operations. All of these variables are also percentage variations.

Since I had aggregated information on the entire population of interest (the sub-regions of Rio Grande do Sul), and the research project emphasized the intention of understanding the behavior of each individual economic unit, the fixed effects (FE) model for panel data proved to be the most appropriate statistical technique for the present purposes. In macroeconomic studies, in which the units are selected by nature or circumstance, the correct decision is to use FE. The idea is that any inference has to be about that specific group.
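
The core of the fixed effects approach is the within transformation: demeaning each unit's data removes its time-invariant effect. The sketch below shows this for a single regressor in plain Python; it is an illustration with hypothetical data, not the Stata estimation used in the project.

```python
def within_estimator(panel):
    """Fixed-effects (within) slope for a single regressor.

    panel maps each unit to a list of (x, y) pairs observed over time.
    """
    num = den = 0.0
    for obs in panel.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        # Demeaning by unit removes the time-invariant unit effect.
        num += sum((x - x_bar) * (y - y_bar) for x, y in obs)
        den += sum((x - x_bar) ** 2 for x, _ in obs)
    return num / den

# Hypothetical data: y = unit effect + 2 * x, with very different unit effects.
panel = {
    "A": [(1.0, 12.0), (2.0, 14.0), (3.0, 16.0)],   # unit effect +10
    "B": [(1.0, -8.0), (2.0, -6.0), (3.0, -4.0)],   # unit effect -10
}
beta = within_estimator(panel)
print(beta)
```

The slope is recovered regardless of how large the unit effects are, which is why inference stays tied to the specific group of units observed.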

This project involved support in defining the statistical methodology used in the research “Teacher expectations and performance of 9th graders in Brazil”. The sponsor wrote the first part, which contextualized the problem, and, in light of the type of data and the problem presented, I proposed the methodology/statistical approach to be used in carrying out the research.

Thus, the proposal was to fit a Spatial Econometric Model with the municipal average proficiency of 9th grade students in mathematics or Portuguese in 2020 (SAEB/Prova Brasil) as the dependent variable (Y) and the municipal average of teacher expectations as the variable of interest (X). As controls, I included indicators of teacher support, adequacy of teacher training in basic education, complexity of school management, and socioeconomic level of basic education schools.

The general objective of this project was to verify the relationship between the efficiency with which Brazilian states applied the resources allocated to the government function Education (subfunction High School) and the Human Development Index (HDI) between 2005 and 2017. The first step was to measure the efficiency of these applications; the second, using panel information on all Brazilian states, was to relate it to the HDI.

The first step was performed by the sponsor, using the DEA methodology, and I took on the second step. For descriptive purposes, I first carried out an Exploratory Spatial Data Analysis and then, within the panel data methodology, ran fixed effects panel models estimated in two stages with instrumental variables (2SLS).

This project dealt with a macroeconomic analysis, based on a public database of 38 corporate debt securities from 11 Brazilian companies between 2011 and 2016, studying the ability of structural models to explain variations in firms' credit spreads. Based on the literature, the proposed model combined explanatory variables common to the market with firm-specific explanatory variables, in order to identify the explanatory power of structural models for credit spread variations. For this, I used panel data modeling with standard errors estimated by the Fama-MacBeth procedure, as proposed in the literature.
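
The Fama-MacBeth procedure can be sketched in two steps: run one cross-sectional regression per period, then average the coefficients and take the standard error from their variation over time. The example below is a simplified single-regressor version in plain Python with hypothetical data; the project's actual specification had several regressors.

```python
import math
import statistics

def ols_slope(xs, ys):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def fama_macbeth(periods):
    """periods: list of (xs, ys) cross-sections, one per time period."""
    slopes = [ols_slope(xs, ys) for xs, ys in periods]   # step 1: one regression per period
    estimate = statistics.fmean(slopes)                   # step 2: average the slopes
    std_error = statistics.stdev(slopes) / math.sqrt(len(slopes))
    return estimate, std_error

# Hypothetical cross-sections: three periods, four securities each.
periods = [
    ([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]),    # slope 1
    ([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]),    # slope 2
    ([1.0, 2.0, 3.0, 4.0], [3.0, 6.0, 9.0, 12.0]),   # slope 3
]
estimate, std_error = fama_macbeth(periods)
print(estimate, std_error)
```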

In this project I evaluated the relationship between child labor and participation in federal government social programs (Bolsa Família). To achieve the objectives, I obtained information from the 2014 PNAD, including several sociodemographic variables. With the database in hand, I restructured the variables to make them more intelligible. However, despite the immense sample size, the small number of sampled children aged 5 to 9 years who were reported as working (n=125) indicates that the PNAD sampling strategy was perhaps not adequate for this purpose.

Thus, I proposed a multivariate model that takes rare events into account: the complementary log-log model. This, however, does not eliminate the problems arising from the research design (sample stratification). A Descriptive Analysis and some Bivariate Analyses were also proposed.
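
What distinguishes the complementary log-log model from the logit is its asymmetric link function, which is one reason it is often suggested for rare events. The sketch below compares the two inverse links in plain Python; the function names are mine and the linear predictor values are arbitrary.

```python
import math

def cloglog_inverse(eta):
    """Inverse complementary log-log link: P(y = 1) for linear predictor eta."""
    return 1.0 - math.exp(-math.exp(eta))

def logit_inverse(eta):
    """Inverse logit link, shown only for comparison."""
    return 1.0 / (1.0 + math.exp(-eta))

# Arbitrary linear predictor values, from very unlikely to very likely events.
for eta in (-4.0, -2.0, 0.0, 2.0):
    print(eta, round(cloglog_inverse(eta), 4), round(logit_inverse(eta), 4))
```

The logit curve is symmetric around probability 0.5, while the complementary log-log curve approaches 1 faster than it approaches 0.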

This project dealt only with a methodological proposal, in which the objective was to evaluate and propose the appropriate statistical techniques to fulfill the following research objective: “Identification of the drivers of innovation internationalization”; and to test the following hypotheses: H1: Companies with international experience tend to internationalize their innovation activities; H2: Companies with absorption capacity tend to internationalize their innovation activities; H3: Companies located in countries whose National Innovation Systems do not favor the practice of innovation tend to internationalize their R&D and innovation activities in general.

Due to the limited scope of the research, with few observations expected and only a preliminary exploratory analysis intended, I proposed only a Descriptive Analysis and a Bivariate Analysis based on nonparametric tests.

The present project was divided into four activities: 1) Study of the relationship between Social Networks and Level of Knowledge; 2) Social Networks and Correct Voting; 3) Proportional Correct Vote; and 4) Consequences of Political Ignorance. All analyses were to be based on the same large database, which first had to be structured, organized and cleaned. I did this in Stata. In general terms, the project focused on the presidential and federal deputy elections, and the central question was: do social networks make it easier to vote correctly or, on the contrary, create obstacles to this end? The project also addressed the issue of correct voting in complex electoral contexts (CECs), and on this point I also sought to determine whether voters vote correctly in legislative elections.

In addition to structuring, organizing, cleaning and designing variables to meet the objectives, which was an immense amount of work, several statistical techniques were considered to address the research problems, such as: Descriptive Analysis, Bivariate Analyses, Binary Logit Models (Logistic Regression) and Ordinal Logit Models (with a stepwise procedure).

It was an extensive project, in which I sought to answer whether there was evidence of a bubble in the Brazilian real estate market between 2001 and 2013, and which factors most influence the dynamics of residential real estate prices in that market. I had monthly time series for the period and therefore used time series analysis techniques.

To identify the presence or absence of bubbles, I used the Johansen Cointegration Test between the proxies for the fundamental value and the market value of residential real estate in the Brazilian market. I also performed the Granger Causality Test between these two proxies. To verify the influence of determining factors on real estate prices, I estimated a Vector Autoregression (VAR) system relating some variables of the real estate and civil construction sectors. After fitting the model, I analyzed the Impulse Response Function (IRF) and the Variance Decomposition (VD).
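
For intuition on the impulse response analysis: in a VAR(1), the response h periods after a unit shock is given by the h-th power of the coefficient matrix. The sketch below computes these non-orthogonalized responses in plain Python; the coefficient matrix is hypothetical, not an estimate from the project, and a VAR with more lags would need the companion form.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def impulse_responses(phi, horizons):
    """Responses to unit shocks in a VAR(1) y_t = phi * y_{t-1} + e_t.

    Returns [phi^0, phi^1, ..., phi^horizons]; entry [i][j] of phi^h is the
    response of variable i, h periods after a unit shock to variable j.
    """
    n = len(phi)
    current = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    path = [current]
    for _ in range(horizons):
        current = mat_mul(phi, current)
        path.append(current)
    return path

# Hypothetical coefficient matrix for two variables.
phi = [[0.5, 0.1],
       [0.2, 0.3]]
irf = impulse_responses(phi, 2)
for h, response in enumerate(irf):
    print(h, response)
```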

Finance

In this report, I investigated the moderating role of Corporate Reputation (CR), Advertising, Publicity, and Communication (APC) investments, and the COVID-19 crisis in the relationship between Corporate Social Performance (CSP) and Corporate Financial Performance (CFP). I examined the influence of CSP on CFP and assessed the moderating effect of CR, APC investments, and the COVID-19 crisis on that relationship.

Before estimating the models for testing the study hypotheses, I ran a descriptive and bivariate analysis on the database variables. To avoid difficulties with survivorship bias, I did not address the missing values; however, I did treat the outliers using two different approaches.

I used static linear panel data modeling. The parameters were estimated by GLS (Generalized Least Squares) with individual-specific random effects (RE). I chose RE models over fixed-effects (FE) models because the independent variables of interest, Corporate Social Performance and Corporate Reputation, exhibited little variability over time, which caused problems in the FE estimations. The RE model had the benefit of estimating all coefficients, even those of variables that remained constant over time.

From the standpoint of the study objectives, I made an a priori decision in favor of RE models and, despite their benefits, set aside FE models. Although the RE estimator remains consistent even if the true model is POLS (Pooled Ordinary Least Squares), I used the Breusch-Pagan test (POLS against RE) to decide between the RE and POLS models.
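
The Breusch-Pagan Lagrange multiplier statistic for this choice can be computed from the pooled OLS residuals alone. The sketch below implements the balanced-panel formula in plain Python; the residuals are hypothetical, and Stata's implementation also handles unbalanced panels, which this sketch does not.

```python
import math

def breusch_pagan_lm(residuals):
    """Breusch-Pagan LM test of pooled OLS against random effects.

    residuals: one list of pooled-OLS residuals per unit, all of the same
    length T (balanced panel). Returns (LM statistic, p-value).
    """
    n = len(residuals)
    t = len(residuals[0])
    sum_sq_unit_totals = sum(sum(e) ** 2 for e in residuals)
    sum_sq = sum(x * x for e in residuals for x in e)
    lm = n * t / (2 * (t - 1)) * (sum_sq_unit_totals / sum_sq - 1) ** 2
    p_value = math.erfc(math.sqrt(lm / 2))  # chi-square with 1 degree of freedom
    return lm, p_value

# Hypothetical residuals: within each of the first two units the residuals
# share a sign, which is the pattern an unobserved unit effect produces.
lm, p = breusch_pagan_lm([[1.0, 1.2, 0.8], [-1.0, -0.9, -1.1], [0.1, -0.1, 0.0]])
print(round(lm, 3), round(p, 4))
```

A small p-value rejects the pooled model in favor of random effects.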

In addition, I followed the standard techniques for diagnosing the remaining assumptions of the RE model. For overall model evaluation and comparison, I used the joint significance tests (F and Wald) and the R2 and RMSE fit measures. For local assessment and to test the study hypotheses, I used the coefficients' significance tests (individual and joint).

In practical or interpretative terms, since the models included interactions, I evaluated the marginal means (predicted values of the dependent variable) and marginal effects (incremental coefficients) after fitting the models, using cross-tables and graphs. I used Stata v.15 for the analyses.
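
With an interaction in the model, the effect of one variable depends on the level of the moderator, which is why marginal effects are evaluated at several moderator values. A minimal sketch in Python, with a hypothetical function name and invented coefficients rather than estimates from the study:

```python
def marginal_effect_of_x(b_x, b_xz, z):
    """dy/dx in y = b0 + b_x*x + b_z*z + b_xz*x*z, evaluated at moderator value z."""
    return b_x + b_xz * z

# Hypothetical coefficients: main effect of x and its interaction with z.
b_x, b_xz = 0.40, -0.15
effects = {z: marginal_effect_of_x(b_x, b_xz, z) for z in (0, 1, 2)}
for z, effect in effects.items():
    print(z, round(effect, 2))
```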

The main objective of this project was to analyze the relationship between Tax Aggressiveness (TA) and Corporate Social Responsibility (CSR) in companies from emerging countries. Specifically, I also sought to: i) describe the level of TA in publicly held companies in emerging countries; ii) analyze the relationship between TA and CSR through the collected metrics; and iii) identify tax and CSR behavior, considering specific characteristics of emerging countries. To this end, I had a sample of 152 non-financial companies from 20 emerging countries between 2016 and 2021, with about 10 variables at the company level and 5 variables at the country level.

The descriptive analysis indicated the need for data winsorization, after which I sought to estimate the models proposed in the initial project. Two issues guided the choice of multivariate techniques: 1) the data at hand came from a balanced panel of companies, and the methods had to take this structure into account; and 2) among the objectives of the research project, it was necessary to evaluate the influence of factors that do not change between individuals (companies), which meant estimating contextual coefficients. Thus, the natural path, recommended by the specialized literature, was to use panel data methods, specifically those in which the dependent variable is continuous and there is no lagged term on the right-hand side, that is, static linear panel data methods.

However, of the three approaches generally used in empirical research with panel data, i) pooled models (POLS), ii) random effects (RE) models and iii) fixed effects (FE) models, issue 2) above meant that I could use only the first two: with iii), the coefficients of the country-level variables could not be estimated.

In this sense, although the RE estimator remains consistent even when the true model is POLS (Pooled Ordinary Least Squares), I applied the Breusch-Pagan test to choose between the two types of models (RE versus POLS). Additionally, other diagnostics were used: multicollinearity (VIF), Wooldridge's test for first-order serial autocorrelation and, for the heteroscedasticity hypothesis, the Pesaran test. The models were estimated by Generalized Least Squares (GLS), with correction for heteroscedasticity and autocorrelation when detected.

In this project I sought to identify how the tax management undertaken by companies relates to the generation of added value and their contribution to employees, creditors and shareholders. For this, I had observations of 321 companies between 2010 and 2019, with variables collected from ValorPRO. The database was not complete, that is, it had several missing values, and, following the research project and the sponsor's request, I estimated panel data models.

Initially, I cleaned the data, evaluating the univariate (winsorization) and multivariate (dfits) outliers, and then estimated four panel data models. These models were estimated according to the hypotheses about the error term: random effects (RE) or pooled models. The choice of the most appropriate estimator was based on the Breusch-Pagan test. The additional hypotheses of heteroscedasticity and first-order autocorrelation in the panel context were checked with the Wald and Wooldridge tests, respectively. When these tests indicated heteroscedasticity or first-order autocorrelation, models with standard errors robust to these problems were estimated. First-order autocorrelation was estimated from the Durbin-Watson statistic.
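
The link between the Durbin-Watson statistic and first-order autocorrelation is the approximation rho ≈ 1 − DW/2. The sketch below shows it for a single ordered series of residuals in plain Python; panel software computes a panel version of the statistic, so this is only an illustration with hypothetical residuals.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for one time-ordered series of residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

def rho_from_dw(dw):
    """Approximate first-order autocorrelation implied by DW: rho ~ 1 - DW / 2."""
    return 1.0 - dw / 2.0

# Hypothetical residuals that alternate in sign (negative autocorrelation).
dw = durbin_watson([1.0, -1.0, 1.0, -1.0])
rho = rho_from_dw(dw)
print(dw, rho)
```

Values of DW near 2 imply rho near 0; values toward 0 or 4 indicate positive or negative autocorrelation, respectively.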

In this project I analyzed whether there is any relationship between the tone of tax accounting narratives and the level of tax aggressiveness of publicly traded companies. For this investigation I had some economic and financial variables, with information on 297 companies listed on B3 from 2016 to 2020. However, the data matrix was not complete, and preliminary evaluations indicated several outliers, which were treated by winsorization and, in the multivariate context, through the residuals of the regressions of the estimated models.
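
Winsorization replaces extreme values with the nearest value kept, instead of dropping the observations. The sketch below is one simple symmetric variant in plain Python; the cut-off fraction and data are hypothetical, and the project's actual limits are not stated here.

```python
def winsorize(values, fraction=0.05):
    """Clamp values to the (k+1)-th smallest and (k+1)-th largest observations,
    where k = int(fraction * n), preserving the original order."""
    ordered = sorted(values)
    k = int(fraction * len(ordered))
    low, high = ordered[k], ordered[len(ordered) - 1 - k]
    return [min(max(v, low), high) for v in values]

# Hypothetical data with one extreme value in each tail.
data = [-50.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 90.0]
clean = winsorize(data, fraction=0.10)
print(clean)
```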

Thus, following the procedures indicated in the literature for fitting linear models to (short) panel data, I first performed the Breusch-Pagan, Chow and Hausman tests to choose the most appropriate model. I checked for multicollinearity, heteroscedasticity and autocorrelation through appropriate diagnostic tests. Where these problems were present, I applied the strategies recommended in the literature to overcome them.

In this project I analyzed the impact of financial flexibility on the capital structure of Brazilian publicly traded companies between 2005 and 2020. I had a database with economic and financial information on 339 Brazilian companies traded on B3 between 2000 and 2020.

I applied the panel data methodology, estimating fixed effects (FE), random effects (RE) and pooled models. To help choose the most appropriate model for the data, I used the Breusch-Pagan (RE vs. pooled), Chow (FE vs. pooled) and Hausman (RE vs. FE) tests. Additionally, I checked for problems of multicollinearity, heteroscedasticity and autocorrelation using, respectively, the variance inflation factor (VIF) for each independent variable, the Wald test and the Wooldridge test. Where there was evidence of problems, corrective measures were adopted. Before estimating the models, I carried out a Descriptive and Bivariate Analysis and cleaned the data (treatment of outliers).
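
For intuition on the VIF: it equals 1 / (1 − R²), where R² comes from regressing one independent variable on the others. With only two predictors, R² is simply their squared correlation, which is the case sketched below in plain Python with hypothetical data; with more predictors, a full auxiliary regression is needed.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a model with exactly two of them."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical predictors with correlation 0.8.
vif = vif_two_predictors([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
print(round(vif, 3))
```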

In this project I evaluated whether the overconfidence personality trait influences tax aggressiveness in Brazilian publicly held companies.  I had 2,216 observations from 277 companies between 2010 and 2017, whose variables to measure overconfidence, tax aggressiveness and their controls were constructed from information from Economática® and ComDinheiro.

To estimate the equations of the theoretical models, I used the panel data methodology, so the models were estimated according to the hypotheses about the error term: fixed effects (FE), random effects (RE) or pooled models. The choice of the most appropriate estimator was based on the Breusch-Pagan (pooled vs. RE), Chow (pooled vs. FE) and Hausman (FE vs. RE) tests. The additional hypotheses of heteroscedasticity and first-order autocorrelation in the panel context were checked with the Wald and Wooldridge tests, respectively. Since these tests rejected homoscedasticity and the absence of autocorrelation, I estimated models with heteroscedasticity-robust standard errors or with the Prais-Winsten estimator, which can correct for heteroscedasticity and first-order autocorrelation, with or without the assumption of contemporaneously correlated errors. This estimator is recommended over Feasible Generalized Least Squares (FGLS) when the panel is unbalanced and, just like FGLS, it allows a single autocorrelation coefficient to be estimated for all the data or one specific to each panel (individual). First-order autocorrelation was estimated from the Durbin-Watson statistic.

The objective of this project was to analyze the relationship between the forecasts of financial analysts covering Brazilian banks and tax information. To this end, I had a database of 1,068 incomplete observations from 89 financial institutions between 2006 and 2017. The main objective of the analyses was to develop theoretical models relating Analyst Forecast to Tax Information, controlling for tax smoothing, size, growth and result of the firm, and analyst performance.

The models were adapted from the literature and estimated by the panel data methodology, as fixed effects or pooled models.

The objective of this project was to evaluate the relationship between contingency factors and executive compensation. To meet the objectives, I had about 200 observations collected from a questionnaire applied in the companies via telephone interview, with about 20 variables measuring contingency factors (strategy, structure, environment and technology), executive compensation (timing of remuneration payment and proportion of fixed/variable remuneration) and some control variables (performance, size and sector).

In methodological terms, I started by describing the variables. Next, I related each dependent variable to the independent variables individually, performing non-parametric tests at the same time to investigate possible bivariate relationships. Because i) there were potential interdependencies between the research variables (Performance ↔ Remuneration) and ii) some variables were measured with error (contingency factors), I chose the Structural Equation Modeling (SEM) approach to solve the research problem. Additionally, since the variables for remuneration and timing of remuneration payment were measured at the ordinal or nominal level, and SEM uses the parametric (Pearson) correlation matrix to estimate the model parameters, I chose Generalized Structural Equation Modeling (GSEM), an extension that takes into account the original measurement level of the endogenous variables by using specific models, link functions and distributions for them, such as binary probit, binary logit, ordinal logit, Poisson, etc.

In this project my mission was to learn the opinion of CEOs, CFOs and Controllers regarding the influence of the contingency variables strategy, structure and size on the (in)formalization of the strategic planning practiced by organizations. I had a questionnaire applied to the target audience and information on the respondents' companies, such as size, revenue, and the strategy and structure that best identify the company. I had to use my creativity to create some variables to meet the goals.

In addition to a Descriptive Analysis, before testing the project's hypotheses I had to run an Exploratory Factor Analysis (EFA) on some specific questions, to see whether they formed a factor. Next, I performed some nonparametric tests: Spearman correlation, Mann-Whitney test and Kruskal-Wallis test. The analyses closed with stepwise Multiple Linear Regression models, including each of the independent variables that identify company characteristics as dummies. The methodological strategy behind the statistical techniques used was justified by four main reasons: 1) the small sample; 2) one of the measurement models fitted by EFA was not good: I kept it in the analyses, although it was at the limit of what is acceptable; 3) there was little correlation between the factors estimated by the EFA; and 4) the theoretical model presented in the research project was imprecise.

In this project, I evaluated the influence of the comparability of financial statements on the accuracy of market analysts' forecasts and on the informativeness of the earnings disclosed by companies. For this purpose, a sample of 37 companies was intentionally selected. The period investigated comprises the years 2005 to 2015, making it possible to analyze the period prior to the adoption of IFRS (2005 to 2007), the transition period (2008 and 2009) and the mandatory adoption period (2010 to 2015). Data were collected from the Thomson ONE Analytics and Economática® databases.

Since the same individuals were observed over time, I applied the panel data methodology to a panel that was fully balanced, estimating: 1) pooled models; 2) random effects models; and 3) fixed effects models.

This project was quite comprehensive, because I had to work with 14 different measurement models covering several behavioral finance constructs, such as: overconfidence, optimism, anchoring, framing, illusion of control, availability, affect, conjunction fallacy, representativeness, loss aversion, disposition, etc. From a sample of about 3,000 individuals, with several missing values, I first handled these cases and only afterwards began the analyses.

Thus, for four measurement models I started with Exploratory Factor Analysis (EFA) models and then refined them with Confirmatory Factor Analysis (CFA) models. In this context I also calculated reliability and evaluated measures of convergent and discriminant validity. Together with the other available measures, I fitted a conceptual model based on Structural Equation Modeling (SEM) techniques, and subsequently examined invariance across education and gender.
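
One widely used reliability coefficient for a set of items is Cronbach's alpha. Which coefficient was used in this project is not stated, so treating it as alpha is an assumption; the sketch below, in plain Python with hypothetical item scores, only illustrates the calculation.

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha.

    items: one list of scores per item, with respondents in the same order
    in every list.
    """
    k = len(items)
    sum_item_variances = sum(statistics.variance(scores) for scores in items)
    totals = [sum(answers) for answers in zip(*items)]  # total score per respondent
    return k / (k - 1) * (1 - sum_item_variances / statistics.variance(totals))

# Hypothetical scores: three items answered by four respondents.
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```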

The analyses carried out in this project aimed to evaluate the association between companies' tax avoidance practices and social responsibility. To meet this objective, I had a sample of companies listed on the BM&FBOVESPA, both participants and non-participants in the Corporate Sustainability Index (ISE). Information was collected between 2009 and 2013, forming an unbalanced panel.

I was asked to perform a Cluster Analysis in order to find some natural classification among companies, based on the available variables.  The sponsor believed that the magnitudes of tax avoidance and the option of adopting social responsibility practices can serve to separate companies within logical segments or groups, based on preliminary evidence of dependence between these variables.

The statistical analyses performed in this project aimed to evaluate the relationship between selected financial indicators (LPA, VPA, Sales per Share, ROA, RPL and ROIC) and socio-environmental investments (ISI, ISE and IA), between 2010 and 2014, in publicly traded Brazilian companies of the Steel and Metallurgical sectors listed on the BM&FBovespa.

Initially, I computed descriptive statistics for the study variables and, since six of them are financial performance variables, I tried to reduce the dimensionality (number of variables) by extracting principal components. The components extracted in this stage were then related to the socio-environmental investments (ISI, ISE and IA), first through Spearman's correlation coefficient and then through Kendall's partial rank correlation coefficient, using the companies' Total Assets (TA) as control. These nonparametric measures were necessary because of the lack of normality of the data, tested with the Kolmogorov-Smirnov and Shapiro-Wilk tests.

Health

The purpose of this statistical study was to assess the swallowing of individuals with stage 3/4 Alzheimer's Disease (AD) using the Speech-Language Pathology Protocol for Risk Assessment for Dysphagia (PARD). For this, I used a sample of 59 people who responded to the PARD items (Padovani et al., 2007), and the goals were to i) describe the absolute and relative frequency of each item and test whether the frequencies differed, and ii) determine which items are most related to the classification of the degree of dysphagia. Aside from the items, the only further information I had was the patient's age.

In terms of statistical methods, I used tables and bar graphs to show the absolute and relative frequencies of the PARD items, the Chi-square test (χ2) to compare the frequencies between the possible categories of each item, and the Mann-Whitney test (Z) to compare the degree of dysphagia between the classifications of the PARD items. Following this bivariate analysis, a multiple linear regression was fitted using stepwise variable selection, with the dysphagia degree score as the dependent variable (DV) and the PARD items that were significant in the previous stage as independent variables (IV). The purpose of this method was to find which PARD items best differentiate the degree of dysphagia in AD patients. The analyses were carried out with the SPSS v.27 program.

The purpose of this paper was to compare 10 observations of laboratory tests of the liver, obtained just once, to the liver biopsy. Because the number of observations was small (n = 10), I limited myself to bivariate analyses. Because the variables were measured at least at the ordinal level, I calculated Spearman's Correlation Coefficient along with its confidence intervals (CI). Beforehand, I provided various tables and graphs for descriptive purposes: for scale measures, the basic descriptive statistics (minimum, maximum, mean, and standard deviation), and for ordinal/nominal measures, absolute counts. The analyses were carried out with the SPSS v.27 program.
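
Spearman's coefficient is the Pearson correlation of the ranks. The sketch below computes it in plain Python together with an approximate confidence interval based on the Fisher z transformation; this interval is one common approximation and an assumption on my part, so it may differ from the interval SPSS reports. The data are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values given the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_with_ci(xs, ys, z_crit=1.96):
    """Spearman's rho with an approximate 95% CI (requires n > 3 and |rho| < 1)."""
    rho = pearson_r(average_ranks(xs), average_ranks(ys))
    z = math.atanh(rho)
    half_width = z_crit / math.sqrt(len(xs) - 3)
    return rho, math.tanh(z - half_width), math.tanh(z + half_width)

# Hypothetical paired measurements for n = 10 cases.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
rho, ci_low, ci_high = spearman_with_ci(xs, ys)
print(round(rho, 3), round(ci_low, 3), round(ci_high, 3))
```

With only 10 observations the interval is wide, which is worth keeping in mind when reading the coefficient.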

The general objective of this project was to characterize the types of body pain reported by emergency teleoperators and to verify the relationship (correlation) between vocal complaints and painful body symptoms. Additionally, I intended to compare, in specific terms, the responses of military participants (police and firefighters) and non-military participants (first chance program trainees). The sample consisted of 71 emergency teleoperators, including military and non-military personnel.

The results were presented through descriptive statistics, the Mann-Whitney test (Z), Fisher’s Exact Test (χ2) and Spearman’s Correlation (ρ), seeking to present the arguments in the most disaggregated, accessible and visual way possible. I adopted a non-parametric approach instead of classical parametric tests, such as the t-test and Pearson correlation, because the sample was small and the data were mostly ordinal and, consequently, did not follow a normal distribution, as evidenced by the Shapiro-Wilk and Kolmogorov-Smirnov normality tests.

This project was one of the longest I participated in: it lasted about four years, including almost two years of exhaustive research, and generated several products (five scientific articles). I participated from the beginning, assisting with the design of the experiment, the sample size calculation and the audit of the proposed questionnaires (measures). Additionally, I monitored the data collection and later created, cleaned, organized and structured several databases for analysis. Throughout the process I performed several analyses, ranging from descriptive and bivariate analyses (crossover ANOVA) to several multivariate models, essentially via Generalized Estimating Equations (GEE) and Generalized Mixed Models (GMM).

In this simple project, the main objective was to examine the levels of quality of life in individuals practicing bodybuilding. In specific terms, I intended to: i) compare the quality of life of bodybuilding practitioners and sedentary individuals; ii) identify the level of quality of life in bodybuilding practitioners; and iii) identify the level of quality of life in sedentary people. For this, I had a sample of 20 individuals who answered the WHOQOL-Bref questionnaire, 10 of whom were bodybuilding practitioners and 10 sedentary people.

Thus, in addition to descriptive analyses of the research variables and cross-tabulations by the groups under evaluation, I performed the nonparametric Mann-Whitney test (U) to compare the scores of the four dimensions of the WHOQOL-Bref and of the general scale between the two independent groups (practitioners versus sedentary individuals). I chose this test over its parametric counterpart (t-test) because the sample was small and the variables (scores) were ordinal and therefore not normally distributed.
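To make the comparison concrete, the U statistic can be computed by counting, over all pairs of one practitioner and one sedentary person, how often the first score exceeds the second. The sketch below is only illustrative: the function name and the two groups of scores are hypothetical, and the p-value step is omitted.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b); each tie between groups counts as half a win."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two statistics always add up to the number of pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical WHOQOL-Bref domain scores, 10 people per group.
practitioners = [75, 81, 69, 88, 72, 79, 84, 77, 90, 70]
sedentary = [63, 71, 58, 75, 66, 60, 69, 73, 55, 62]
u_practitioners, u_sedentary = mann_whitney_u(practitioners, sedentary)
```

A value of `u_practitioners` close to the number of pairs (here 100) means practitioners almost always score higher; a value near half of it means the groups overlap.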

In this project I evaluated the HALFT scale in terms of its fit to the proposed cross-cultural adaptation. Thus, I ran Exploratory Factor Analysis (EFA) models from a polychoric matrix, because the items are binary, with parameters estimated by Robust Diagonally Weighted Least Squares (RDWLS). Dimensionality was evaluated by Optimized Parallel Analysis. I then proceeded to Confirmatory Factor Analysis (CFA) models, using the same estimator, and Item Response Theory (IRT) models.

In this project I analyzed data from patients with COVID-19 and evaluated the risk factors associated with death. This was a retrospective study, whose data were collected from the patients’ medical records. In addition to the patient’s profile (admission data), there were records of personal pathological history, diagnosis at admission, previous medication, clinical symptoms, laboratory and clinical tests, information on the ventilation used, clinical complications and classification at the time of diagnosis.

As most variables were dichotomous, including the outcome, the greater emphasis was on the calculation of odds ratios and their respective confidence intervals (for this I used the bootstrap method with n = 1,000, bias-corrected and accelerated) and on Fisher’s Exact Test (χ2). For the scale variables, the t-test (corrected or not for lack of homogeneity) was used. In multivariate terms, I proceeded according to the sponsor’s recommendation: 1) fit a common factor analysis model; and 2) fit a backward Logistic Regression Model.
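The odds ratio and a resampling interval for it can be sketched as follows. This is a simplified illustration under stated assumptions: the data are hypothetical, the function names are my own, and the interval is the plain percentile bootstrap rather than the bias-corrected and accelerated version used in the project.

```python
import random


def odds_ratio(exposed, outcome):
    """Odds ratio from two parallel lists of 0/1 indicators."""
    a = sum(1 for e, o in zip(exposed, outcome) if e and o)
    b = sum(1 for e, o in zip(exposed, outcome) if e and not o)
    c = sum(1 for e, o in zip(exposed, outcome) if not e and o)
    d = sum(1 for e, o in zip(exposed, outcome) if not e and not o)
    # Haldane-Anscombe correction: add 0.5 to every cell when any cell is empty.
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return (a * d) / (b * c)


def percentile_bootstrap_ci(exposed, outcome, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the odds ratio (not BCa)."""
    rng = random.Random(seed)
    n = len(exposed)
    estimates = []
    for _ in range(n_resamples):
        # Resample patients with replacement, keeping each pair together.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(odds_ratio([exposed[i] for i in idx],
                                    [outcome[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_resamples)]
    upper = estimates[int((1 + level) / 2 * n_resamples) - 1]
    return lower, upper


# Hypothetical data: 30 patients, risk factor present (1) or absent (0), death (1) or not (0).
risk_factor = [1] * 15 + [0] * 15
death = [1] * 10 + [0] * 5 + [1] * 5 + [0] * 10
point_estimate = odds_ratio(risk_factor, death)
ci_lower, ci_upper = percentile_bootstrap_ci(risk_factor, death)
```

The percentile interval is the simplest choice; the BCa interval additionally adjusts for bias and skewness in the bootstrap distribution, which matters for a ratio such as this one.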

In this project I described the etiology, forms of presentation and complications of cirrhosis and related them to age and other profile variables.

To perform the cross-tabulations (bivariate analyses), I adopted the following statistical techniques: i) point-biserial correlation; ii) Kruskal-Wallis test; iii) Cramér’s V; and iv) Phi coefficient. Measures i), iii) and iv) are effect sizes in themselves, so in addition to statistical significance they also gave me a sense of the clinical relevance of the findings. I adopted a p-value below .05 for statistical significance, and all confidence intervals were estimated by bootstrap (n = 1000; CI95%: BCa) to address problems arising from the lack of normality.
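As an illustration of the effect sizes mentioned above, Cramér’s V can be derived directly from the chi-square statistic of a contingency table. The sketch below uses hypothetical function names and made-up counts; for a 2x2 table the result equals the absolute value of the Phi coefficient.

```python
def chi_square(table):
    """Return (chi-square statistic, total count) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


def cramers_v(table):
    """Cramér's V: sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    stat, n = chi_square(table)
    k = min(len(table), len(table[0])) - 1
    return (stat / (n * k)) ** 0.5


# Hypothetical cross-tabulation: etiology (3 categories) by complication (yes/no).
etiology_by_complication = [[12, 8], [5, 15], [9, 11]]
v = cramers_v(etiology_by_complication)
```

Values near 0 indicate almost no association and values near 1 a nearly perfect one, which is what makes the measure useful alongside the p-value.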

In this project I evaluated and compared social frailty and its relationship with physical frailty in patients undergoing hemodialysis and kidney transplantation. To pursue this goal, I had information from 80 patients with a medical diagnosis of chronic kidney disease undergoing hemodialysis and 204 patients who underwent kidney transplantation in a kidney transplant unit located in Brazil. To measure social frailty, the HALFT Social Frailty Assessment Scale was available, and to measure physical frailty, the physical domain of the Tilburg Frailty Indicator (TFI). Additionally, the Patient Health Questionnaire (PHQ-9) and the Social Support Scale of the Medical Outcomes Study (MOS) were also available. Each of these scales generated scores for the analyses. Sociodemographic and economic information (gender, age, education, marital status, income, housing) and some health conditions (associated morbidities, treatment time, medication use, etc.) were also available to serve as controls in the multivariate analyses.

As HALFT had not yet been applied in the national context, I first ran Exploratory and Confirmatory Factor Analyses (EFA and CFA) for it alone, and then calculated the reliability of all scales (ordinal Cronbach’s alpha) and fitted three Multiple Linear Regression Models.

The objective of this project was to evaluate the effects of short-term viscosupplementation on patellar chondropathy. In specific terms, we sought to observe differences between supplementation regimens (1, 3 or 5 applications) and the influence of variables such as age, gender, BMI, bilaterality, degree of chondropathy, duration of knee pain, use of medications, frequency of stretching and level of physical activity on treatment performance, which was measured by the WOMAC, Kujala and VAS scales. The central idea was to evaluate whether, during treatment with hyaluronic acid (HA) [before application (T0), 3 months (T1), 6 months (T2) and 12 months (T3)], individuals showed improvement on these scales.

In statistical terms, a descriptive analysis was performed to identify the profile of the sample. To evaluate whether the scale scores improved or worsened throughout the treatment, the Friedman Test (χ2) was used, with subsequent multiple comparisons adjusted by Bonferroni. In a second stage, I fitted a Linear Regression Model by Ordinary Least Squares (OLS) without assuming homoscedasticity and normality, using the bootstrap method (n = 2000, bias-corrected) to estimate the confidence intervals of the model parameters.
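The Bonferroni step after the Friedman test amounts to multiplying each pairwise p-value by the number of comparisons. The sketch below assumes the four time points described above; the raw p-values are hypothetical and the function name is my own.

```python
from itertools import combinations


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


time_points = ["T0", "T1", "T2", "T3"]
pairs = list(combinations(time_points, 2))  # 6 pairwise comparisons

# Hypothetical raw p-values, one per pair, in the same order as `pairs`.
raw_p = [0.001, 0.004, 0.020, 0.030, 0.200, 0.600]
adjusted = dict(zip(pairs, bonferroni(raw_p)))
significant = [pair for pair, p in adjusted.items() if p < 0.05]
```

With six comparisons, a raw p-value must fall below roughly .0083 to remain significant at .05 after adjustment.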

The project aimed to describe the data from a sample of exams with endosonography and cytology findings. It was a database with few variables, but complex to analyze statistically (inference) due to i) many missing values; ii) conditional variables (which depend on responses to other variables); iii) variables with multiple responses; and iv) mostly qualitative information. Thus, I focused on evaluating the sample profile through descriptive analysis and, to enrich the analyses, I built a Word Cloud for the open response variables.

The statistical analysis performed in this project aimed to evaluate: a) among professionals who hold two specializations, namely implantology and orthodontics, the relationship between the choice of the first specialization and the reasons for that choice; b) the time that each professional would wait before using an already osseointegrated dental implant as an absolute anchorage resource for orthodontics; c) the relationship between the use of osseointegrated dental implants and the use of cone-beam computed tomography of the whole skull or face for oral rehabilitation planning; d) the relationship between the use of osseointegrated dental implants and the use of digital planning software based on a tomography of the whole skull (or face) to plan an oral rehabilitation; and e) the relationship between the time that the professional would wait after installing an osseointegrable dental implant before using it as an absolute anchorage resource in orthodontics and whether the professional has already used this tool to also anticipate the oral rehabilitation process.

After data collection, I used the Chi-square Test or Fisher’s Exact Test, because all variables were nominal and there was no a priori theoretical model of their relationships.

This was a relatively simple project, in which the objective was to determine the frequency and sensory phenotypic profile of neuropathic pain among patients with painful leprosy-associated syndrome in a reference service in Central Brazil. For this purpose, I worked with observations of patients submitted to clinical evaluation in order to establish inclusion criteria and to characterize the sample in sociodemographic, epidemiological, clinical and laboratory terms. In this case, a Descriptive Analysis was sufficient, with tables of absolute and relative frequencies, descriptive statistics, bar charts and histograms.

The objective of the project was to build a predictive model, based on Neural Networks, to predict violence against women. A database was available with respondents’ answers about their sociodemographic profile, quality of life (WHO quality of life scale), feeling of security (scale) and domestic violence. I constructed a binary outcome variable indicating whether the woman had suffered (=1) or not (=0) domestic violence, and the interest was in discovering, among all the other variables in the database, possible predictors of the occurrence of domestic violence. According to the survey, more than 50% of the interviewees indicated that they had already suffered some type of domestic violence.

After studying some possible paths, we decided to i) start with the Radial Basis Function method; and ii) select the most important variables (importance > 50%) by blocks, that is, I estimated one model only for the sociodemographic variables, another only with the quality of life items, another with the rest of the variables, and so on. At the end of this process, I had the variables considered most important for each construct and included them in a final model to select the most relevant ones. For the final model I split the database into three random partitions: Training (53%), Test (13%) and Validation (34%). With these three samples I simulated several multilayer perceptron architectures, and the one that returned the most parsimonious and stable model had: i) one hidden layer with a hyperbolic tangent activation function; and ii) a sigmoid activation function in the output layer, with sum of squares as the error function.

The general objective of this project was to evaluate psychomotor and cognitive losses and the risk of falls in healthy elderly people and in those with a probable diagnosis of Alzheimer’s disease (AD).

In addition to Descriptive and Bivariate Analyses, in the multivariate stage I used Multiple Regression Analysis, including the variable that identifies the groups as a dummy independent variable and the others selected in the previous step as dependent and/or independent variables, since, theoretically, a dependence relationship is expected from cognition to motor functions and from these to functionality. Given the small sample, the ordinal nature of the variables, and the lack of normality and homogeneity identified beforehand, we chose to run the regression models by bootstrap (n=10,000), stratified by the group variable and with bias correction.

My role in this project was to organize, structure and clean up the project database, which had been tabulated in Excel and had to be transferred to SPSS. As it was a very “dirty” database, with many variables, several questionnaires applied at several time points, and many observations, the project team asked for my help because of the problems that could potentially arise.

In this project I evaluated whether marking with India ink facilitates the identification of lymph nodes during the surgical and/or pathological dissection of tumors, leading to a greater number of lymph nodes evaluated in the operation. To achieve this goal, I had a sample of about 20 individuals who underwent rectal cancer surgery, half of whom were marked with India ink and half of whom were not.

After delimiting the sample profile with a Descriptive Analysis of the variables, I proceeded to i) the bivariate analysis and, later, ii) the multivariate analysis. The first stage involved Mann-Whitney nonparametric tests and Spearman correlations in the search for potential influencers of the number of lymph nodes evaluated, including the marked versus unmarked comparison, and Chi-square (Fisher’s Exact Test) and Mann-Whitney tests comparing the marked and unmarked groups on the other variables in the database in the search for possible confounders. For the second stage, I selected only the variables that were marginally significant (p-value < 0.10) in the bivariate analysis and fitted a Regression Analysis with the number of lymph nodes as the dependent variable and the influencers and confounders as independent variables. To alleviate the problem of sample size, the confidence intervals (CI) of the bivariate tests were estimated by Monte Carlo simulation (n=10,000 and 95% CI) and, for the regression analysis, by the bootstrap resampling method (n=10,000 and bias-corrected 95% CI).

In this project, the task was to validate the SERVQUAL scale in the context of health service provision. This scale originated in 1985 with the work of Parasuraman and others and has been refined over the years, including several applications/validations at the national level, mainly in the context of management/administration/business, specifically in marketing. The instrument consists of a 22-item questionnaire that assesses the customer’s expectation (E) and perception (P) in 5 dimensions (tangibility, reliability, responsiveness, assurance and empathy), in order to obtain a customer satisfaction index [GAP(Q) = P – E].
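The gap index above is a simple difference per item, usually averaged within each dimension. The sketch below illustrates it for a single hypothetical respondent; the function name, the ratings and the number of items shown per dimension are assumptions made for the example.

```python
def gap_scores(expectation, perception):
    """Return the mean gap (P - E) per dimension.

    Both arguments map a dimension name to a list of item ratings,
    with items in the same order in both dictionaries.
    """
    gaps = {}
    for dimension, e_items in expectation.items():
        p_items = perception[dimension]
        item_gaps = [p - e for p, e in zip(p_items, e_items)]
        gaps[dimension] = sum(item_gaps) / len(item_gaps)
    return gaps


# Hypothetical ratings on a 1-7 scale for two of the five dimensions.
expectation = {"reliability": [7, 6, 7], "empathy": [6, 6]}
perception = {"reliability": [5, 6, 6], "empathy": [6, 7]}
gaps = gap_scores(expectation, perception)
```

A negative gap means the perceived service fell short of what the customer expected in that dimension; a positive gap means expectations were exceeded.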

As this was the validation of a scale in a different culture and in a different context/area from the one for which it was designed, all the methodological steps inherent to such a proposal were followed: translation/back-translation, semantic and content evaluation by professionals/judges, etc. Thus, I applied the statistical techniques recommended for the construction/development of scales: Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA).

The objective of this project was to evaluate, in patients with and without diabetes mellitus (DM) and in patients with and without preserved ejection fraction, the relationship between the following events (death from any cause, non-fatal myocardial infarction, need for intervention, or stroke) and the following risk factors: age, sex, smoker or ex-smoker, previous myocardial infarction, hypertension, creatinine, clearance, weight, cholesterol > 100, triglycerides > 150, diseased vessels, angiographic data (uni/bi/tri arterial disease) and type of treatment received (angioplasty, surgery or medical treatment).

Since I had the start date of the patient observation and the date when the event of interest occurred, I constructed a variable with the number of days until the first event in order to run a Survival Analysis. Thus, I applied the Kaplan-Meier procedure to the interest groups and, based on the risk factors, a Cox Regression. However, before the Survival Analysis, I related the research variables to the interest groups, presenting the respective bivariate tests (Chi-square and ANOVA). I used bootstrapping for the inferential calculations.
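The Kaplan-Meier procedure can be illustrated with a short sketch built on the "days until first event" variable. This is not the software routine used in the project: the function name and the follow-up data are hypothetical, and confidence bands and group comparisons are left out.

```python
def kaplan_meier(durations, event_observed):
    """Return [(time, survival probability)] at each distinct event time.

    durations: days until the first event or until the end of follow-up.
    event_observed: 1 if the event occurred, 0 if the patient was censored.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        events = sum(1 for d, e in zip(durations, event_observed) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if events:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data for eight patients (days, event indicator).
days = [30, 45, 45, 90, 120, 200, 365, 365]
event = [1, 1, 0, 1, 0, 1, 0, 0]
curve = kaplan_meier(days, event)
```

Censored patients contribute to the risk set up to their last observed day, which is what distinguishes this estimate from a simple proportion of events.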

In this project, the task was to validate three scales that deal with total quality management in hospitals: i) Organizational Culture Assessment Instrument; ii) Quality Improvement Implementation II; and iii) Preparation of Health Services for Accreditation. These scales were collected in the same research instrument, which was answered by about 600 health professionals working in seven hospitals accredited by the National Accreditation Organization (level III) and by Canadian Accreditation. The scales are translations from English and have passed face and content validation.

To achieve the objectives, we used Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA).

In this project, I compared the efficacy of dermoscopy and clinical examination in predicting free histological margins in basal cell carcinoma lesions of patients undergoing surgical therapy in the plastic surgery service. Secondarily, I also sought to determine the relationships between clinical and dermoscopic demarcations and free, scant and compromised histological margins. Moreover, in descriptive terms, I identified which histological types (nodular, micronodular, sclerodermiform, superficial and metatypic) were most frequent in the sample and their association with the histological margins of the lesions, as well as the frequency of clinical data of the patients, such as age, sex and location of the lesion.

Since almost all variables were nominal, I used the Chi-square Test / Fisher’s Exact Test to evaluate the relationships of interest, according to the research objectives, except in the case of age, for which I used the Mann-Whitney Test (Z). Later, I used a Logistic Regression to examine the odds ratios of the explanatory variables (those marginally significant in the bivariate analysis) for the free histological margin.

The analysis carried out in this project aimed to evaluate the impact of daily supplementation with 10 g of leucine on strength and muscle mass gain in healthy young people undergoing a 12-week strength training program, using the isonitrogenous amino acid alanine as placebo. It was a case-control study with observations before and after the intervention.

In these terms, in addition to a Descriptive Analysis, I performed the following analyses: i) for maximum repetition, strength resistance and ultrasound: a paired-samples t-test for each variable to verify whether there was a difference before and after treatment, and an ANCOVA taking each variable of interest after treatment as the dependent variable and the same variable collected before treatment as a control covariate, with treatment (leucine/placebo) as a fixed effect; ii) for nutritional and protein value per meal: a paired-samples t-test for each variable to verify whether there was a difference before and after treatment, and a multivariate ANCOVA (MANCOVA) in each of these reference groups. In this case, the set of variables of interest after treatment are the dependent variables and the same set of variables collected before treatment are the control covariates, with the leucine/placebo group as a fixed effect.

In this project I evaluated the construction (translation and adaptation) of a Range of Essential Competencies in Public Health. The original objectives of the study were: i) to translate and adapt to the Brazilian reality the document Regional Framework of Essential Competencies in Public Health for human resources in health in the Region of the Americas, suggested by the Pan American Health Organization; and ii) to build and validate an instrument to map competencies from that document, translated and adapted to the Brazilian reality. However, according to my reading of the work and my understanding of all the methodological stages for the development (or translation/adaptation) of a psychometric instrument, the work did not advance to validating the instrument.

The strategy for applying the statistical techniques was as follows: 1) as only the questionnaires of the first round and the pre-test are the same, the first applied to a sample of 130 individuals and the second to a sample of 36 health professionals, we tested whether opinions differed between these two samples using the Mann-Whitney test (U). If, in general, these opinions did not differ substantially, the analysis would follow with the samples of individuals (1st, 2nd and 3rd rounds); 2) with the three related samples of judges, where the number of variables (items) differs in each sample, since only the items that did not reach 70% total agreement passed to the next round, I evaluated the correlation (Spearman) and agreement (Kendall’s W) between the items for inferences on internal consistency; and 3) in another attempt to evaluate internal consistency, i.e., the homogeneity of the sample of items in the questionnaire, I used Cronbach’s Alpha.

This project was a descriptive and retrospective observational study, with data collected from admission notebooks and electronic medical records. Information was tabulated for patients readmitted less than 72 hours after discharge from the ICU during the same hospitalization, for those readmitted more than 72 hours after discharge, and for those not readmitted. The APACHE II score and other variables such as age, gender, cause of admission and outcome of these patients were collected on the first day of admission to the HRAD ICU, from January 2014 to January 2016. The objective was to analyze the rates of early readmission during the same hospitalization and their relationship with the APACHE II score, as well as the outcome of readmitted patients.

In addition to a Descriptive Analysis of all variables, segregated by the three interest groups (early readmission, late readmission and non-readmission), I used the Chi-square test (χ2) to relate the qualitative variables and the Kruskal-Wallis Test (H), with subsequent multiple comparisons, to examine the differences between these three groups in the quantitative variables, mainly the APACHE II score. Additionally, to check the influence of potential confounders on the relationship of APACHE II with early, late and non-readmissions, I performed the Mann-Whitney Test (U).

In this project, I evaluated the individual factors and the context of the UBS associated with the cure and non-cure of pulmonary TB in Brazil. It was a cohort study of secondary data (Sinan, PMAQ-AB and DAB). I included all new cases of pulmonary TB closed as “cure” or “non-cure” and attended in UBS that participated in the 2nd cycle of PMAQ-AB, in the period from 01/07/2013 to 30/06/2014.

I used a Hierarchical Model for the inclusion of the independent variables, composed of individual characteristics (Level 1), contextual characteristics of the UBS of care (Level 2) and characteristics of the municipalities of the UBS (Level 3), resulting in a model with three hierarchical levels (individual, UBS and municipality).

The present project was an experiment with a case-control design, in which 24 male albino Rattus norvegicus, aged between 3 and 5 months and weighing between 250 and 375 grams, were randomly divided into two groups: study (n=12) versus control (n=12). The study group was submitted to two types of treatment: sessions of hyperbaric oxygen therapy with follow-up until completion of the process in three days (n=6) or five days (n=6).

In addition to descriptive analysis, I performed some bivariate tests (Kruskal-Wallis, Mann-Whitney and Chi-square) to evaluate the research questions.

In this project, I evaluated the association of fatigue with the concentration of inflammatory biomarkers and with changes in respiratory function in patients with Multiple Sclerosis (MS) and a low level of functional disability. To achieve the objectives, I had a case-control sample in which about 40 patients diagnosed with relapsing-remitting MS were paired with the same number of healthy subjects (controls), selected in the same environment where the analyzed patients live. Each control was selected and matched to a patient according to gender and age group.

To verify the quality of the experimental design and of the fatigue measurement adopted, I used the following strategies: i) the Chi-square Test between patients with MS and healthy individuals for the variables body mass index, associated pathologies, drugs used, ex-smoker status and physical activities practiced. The idea was to check that the cases and controls were also homogeneous in these respects, in addition to the demographic ones; and ii) Cronbach’s Alpha for the two available fatigue scales (FSS and MFIS) and, given the classification obtained with the scales (absence vs. presence of fatigue), the Mann-Whitney Test to verify whether the classified groups differ in terms of respiratory parameters. The idea was to use the scale with better reliability and that better classifies individuals with weakened respiratory parameters.
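The reliability coefficient used to choose between the two fatigue scales can be sketched as below. The function name and the item scores are hypothetical; the formula is the standard Cronbach’s alpha based on item variances and the variance of the total score.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of k lists, one per item, each holding one score per respondent.
    """
    k = len(items)
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical answers of six respondents to a three-item fatigue scale.
fatigue_items = [
    [4, 5, 2, 6, 3, 5],
    [5, 5, 3, 6, 2, 4],
    [4, 6, 2, 7, 3, 5],
]
alpha = cronbach_alpha(fatigue_items)
```

Computing `alpha` separately for each scale gives a direct basis for the comparison described above, since both were answered by the same sample.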

To meet the first objective, evaluating the occurrence of fatigue in the sample, I used the McNemar Test for related samples, because the two variables are binary. For the second objective, as the concentrations of inflammatory biomarkers were scale variables, I used the Wilcoxon Test for related samples. To relate the presence of fatigue to the concentration of inflammatory biomarkers and to respiratory parameters, I used logistic regressions (Logit Model) with random effects to account for the dependence in the sample.
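For matched pairs with a binary outcome, the McNemar Test depends only on the discordant pairs. The sketch below computes the exact (binomial) two-sided p-value; the function name and the counts are hypothetical, and the project software may have used the chi-square approximation instead.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: pairs where only the patient shows fatigue.
    c: pairs where only the matched control shows fatigue.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Under the null hypothesis each discordant pair falls in either cell with probability 1/2.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts among about 40 matched pairs.
p_value = mcnemar_exact(b=14, c=3)
```

Pairs in which both members agree (both fatigued or both not) carry no information about the difference and do not enter the calculation.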

In this project, I evaluated the adherence of primary health care (PHC) in a Brazilian municipality to the recommendations of the main clinical guidelines for diabetic care. Specifically, I: i) characterized the profile of people with diabetes mellitus (DM) assisted by PHC in the municipality; ii) compared the number of procedures performed with what is recommended by the DM guidelines: number of medical consultations, nursing consultations, participation in operative groups, examination of the feet (clinical examination, use of the strain gauge, blood pressure measurements), fasting venous glucose measurements, total cholesterol, HDL, triglycerides, creatinine, microalbuminuria and LDL calculation; iii) verified the percentage of diabetics using oral medications, insulin or the combination of both treatments; and iv) verified the percentage of hypertensive diabetics using CAIS or ANA. This was an observational, descriptive, cross-sectional study, with the observation interval being the year 2013.

The analyses carried out in this project were performed in two concomitant stages: 1) Descriptive Analysis (frequencies and descriptive statistics); and 2) Univariate analyses (Binomial Test and Chi-square Test) comparing the information contained in the database with the guidelines of the Brazilian Diabetes Society (2013) and the American Diabetes Association (2016), taken as expected parameters.

The general objective of this study was to evaluate the health-related quality of life of patients undergoing prostate cancer treatment. To pursue the objectives, I had a sample of about 200 individuals, whose information was collected with two instruments: 1) closed questions that characterize patients in sociodemographic and clinical terms; and 2) the Expanded Prostate Cancer Index Composite (EPIC) questionnaire, a tool for evaluating the effects of treatment on the quality of life of prostate cancer patients, validated and translated into Portuguese.

As a preliminary step, before addressing the proposed objectives, I evaluated the reliability of the EPIC. I assumed that the instrument was robust and had been translated and validated for Brazil following all the steps necessary to make it reliable at the national level. Thus, I calculated only the alpha coefficients to evaluate the reliability of the EPIC scales and subscales in the sample. I then met both secondary objectives through Descriptive Analysis. To fulfill the last secondary objective and evaluate the health-related quality of life of prostate cancer patients, I performed a Bivariate Analysis, relating: i) the EPIC scales and the Treatment variable through the Mann-Whitney Test (U); ii) the EPIC scales and the sociodemographic and clinical data through the Mann-Whitney and Kruskal-Wallis (KW) Tests; and iii) the Treatment variable and the sociodemographic and clinical data through the Chi-square Test (χ2). Additionally, to confirm or refute the significant relationships found in the bivariate tests, I entered the sociodemographic and clinical information and the Treatment variable simultaneously in five Analyses of Variance (ANOVAs), estimated by bootstrap, in which the differences in scores across the Treatment variable were compared using the Bonferroni adjustment.

In this project I investigated the association of cognition, schooling and habitual practice of physical activity (PA) with the perception of quality of life (QOL) in older women. To meet the objectives I had a sample of about 500 older women who answered a questionnaire containing the following instruments: i) a form with identification and sociodemographic data; ii) the Brazil Economic Classification criterion; iii) the Brazilian version of the International Physical Activity Questionnaire (IPAQ), proposed by the World Health Organization (WHO), short form, version 8, for the evaluation of PA; iv) the Brazilian versions of the WHOQOL-BREF and WHOQOL-OLD questionnaires of the World Health Organization (WHO) for QOL assessment; and v) the Mini-Mental State Examination (MMSE) for the evaluation of cognition.

Before evaluating the associations under study, I examined the reliability (Cronbach’s Alpha) of the scales used in the research sample. Then, in order to achieve the other objectives, in addition to a Descriptive Analysis, I fitted two Multivariate Analyses of Variance (MANOVA) with all QOL variables taken as dependent, the variables cognition and PA as factors, and schooling as a covariate. All tests needed to examine the assumptions of the MANOVAs were also performed. I took this route because no theoretical model with a priori relationships was available that would have allowed Structural Equation Modeling (SEM) to be considered.

In this project, I evaluated the profile of perinatal and infant mortality in the city of Anápolis between 2012 and 2014, and its classification according to the avoidability criteria proposed by the list of causes of deaths avoidable through interventions under the Brazilian Health System for children under five years of age. I had information on deaths, subdivided into the fetal, early neonatal, late neonatal and post-neonatal periods, and classified as preventable or non-preventable. Due to the number of missing values (information was lacking in all variables), it was not possible to fit multivariate models, such as the logistic regression considered in the initial planning. As statistical procedures, I used descriptive statistics and cross-tabulations with bivariate tests: i) Mann-Whitney test; ii) Chi-square test; and iii) Mantel-Haenszel test of odds ratios independence.
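Related to the Mantel-Haenszel procedure listed above, the pooled odds ratio across strata can be sketched as follows. Note that this shows the Mantel-Haenszel estimate of the common odds ratio, not the test itself; the function name, the strata and the counts are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio.

    strata: list of 2x2 tables given as (a, b, c, d), where a and b are the
    counts with and without the outcome in the first group, and c and d the
    corresponding counts in the second group.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical tables, one per period of death, crossing a risk factor
# (rows) with preventable versus non-preventable classification (columns).
tables_by_period = [
    (12, 8, 6, 14),
    (9, 6, 5, 10),
    (7, 9, 4, 12),
]
pooled_or = mantel_haenszel_or(tables_by_period)
```

Weighting each stratum by its size keeps small strata from dominating the pooled estimate, which is relevant when some periods have few deaths.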

This was a relatively simple project in which I evaluated the spontaneous speech time of patients during medical consultations performed in a public network (Santo André, Seleta and Vila Rosa). I had information on the following variables: i) age (years); ii) speech time (seconds); iii) place of information collection; iv) sex; v) schooling (without full education); vi) income (from no income up to 20 minimum wages). The purpose was to assess whether speech time (outcome variable) is related to the other variables in the database (independent variables).

I evaluated the sample profile through a descriptive analysis, ran bivariate analyses of the independent variables against the outcome using nonparametric tests (Mann-Whitney and Kruskal-Wallis), and fitted a Multiple Linear Regression. Before these analyses, however, I dealt with the schooling and income variables, which had many response categories with few observations in some of them. This could hide relationships between the independent variables and the outcome, or even prevent the calculation of some statistical tests, so I opted to merge and/or create categories via the CHAID (Chi-squared Automatic Interaction Detection) method.

The project aimed to describe the profile of affiliate stigma in mothers of children with Autism Spectrum Disorder (ASD), and its relation to the severity of symptoms, in the city of São Paulo. To measure affiliate stigma, the researchers used the Brazilian version of the ASS (Affiliate Stigma Scale), and the ASD diagnosis was based on the ABC (Autism Behavior Checklist) screening.

As there was a proposal to translate the ASS scale into the Brazilian context, I first evaluated the judges’ content assessments with Kendall’s coefficient of concordance, examining the items and the criteria that guide the scale. In a second stage, with the test sample collected, I ran a Principal Component Analysis (PCA) to assign the items to the three factors proposed in the original scale. In this phase, the components obtained were also evaluated for reliability (Cronbach’s Alpha). After the Descriptive Analysis (mean, standard deviation, absolute and relative frequency) of all variables of the questionnaire, I related the ASS components to all other variables of the study in a Bivariate Analysis with nonparametric tests, given the characteristics of the variables: i) for variables with natural ordering, the Spearman correlation coefficient; ii) for variables with only two categories, the Mann-Whitney Test; and iii) for variables with more than two categories, the Kruskal-Wallis Test.

In this project, I helped the sponsor adjust a model that intended to explain the use of anxiety medications without medical diagnoses in firefighters and paramedics, based on sociodemographic information, household activities, life habits, working conditions, psychosocial characteristics at work, exposure to stressful events at work and in life, morbidity and general health information.

Considering the prevalence of the outcome, the existence of an association was verified through multivariate logistic regression, with manual removal of variables (backward elimination procedure). All variables that presented p ≤ 0.20 in the Bivariate Analysis were selected to enter the multivariate model. A p-value ≤ 0.05 and a 95% confidence interval for the odds ratio that does not include the value 1 were the criteria for a statistically significant association.
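The significance criterion above (a 95% confidence interval that excludes 1) can be sketched for the simplest case. This is an assumption-laden illustration: it computes an unadjusted odds ratio from a hypothetical 2x2 table with a Wald interval, whereas the project’s odds ratios came from the adjusted logistic model. All names and counts are made up.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI from a 2x2 table.

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below 1."""
    return lower > 1 or upper < 1

# Hypothetical counts: medication use by exposure to a stressful event
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(round(or_, 2), round(lo, 2), round(hi, 2), excludes_one(lo, hi))
```

In this made-up table the odds ratio is 2.25, but the interval still crosses 1, so the association would not meet the criterion.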

This project had a relatively simple objective: to evaluate the quality of life of workers on regular and atypical work shifts. To measure quality of life, the brief quality of life questionnaire of the World Health Organization (WHOQOL-bref) was applied to 200 individuals: 100 workers on regular shifts (Monday to Friday or Monday to Saturday) and 100 workers on atypical shifts (12 h by 24 h or 12 h by 48 h). The WHOQOL-bref questions and their domains (physical, psychological, social relations, environment and general) were compared between regular and atypical shift workers with Student’s t-test for independent samples, as emphatically recommended by the sponsor.

The data analysis performed in this project aimed to answer the following research questions: What is the survival of patients who developed acute kidney injury after cardiac surgery compared to those who did not? Which risk factors related to acute kidney injury affect the survival of those who underwent cardiac surgery?

From a sample of individuals and a database with 25 variables, in which 23 risk factors were identified, I first used descriptive, bivariate and multivariate (logistic regression) analyses to identify the risk factors related to acute kidney injury. The risk factors identified in these preliminary analyses, and acute kidney injury itself, were then analyzed through Survival Analysis: the Kaplan-Meier procedure and Cox Regression.
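To make the Kaplan-Meier step concrete, here is a minimal product-limit estimator in plain Python. The follow-up times and event flags are hypothetical, and the function is only a sketch of the estimator, not the software procedure used in the project.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)  # deaths + censored at t
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
        i += leaving
    return curve

# Hypothetical follow-up in months for six patients
curve = kaplan_meier(times=[5, 8, 8, 12, 15, 20], events=[1, 1, 0, 1, 0, 1])
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients leave the risk set without lowering the curve, which is the feature that distinguishes this estimate from a simple proportion of survivors.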

In this project I examined the opinion of three judges who evaluated MRI images to identify the lateral surface of the brain. For this, I built several graphs and tables to describe each of the identified images and, based on Kappa statistics and their classification, compared the groups of variables and types of analyses (T1 IR GRE versus T1 GRE; right hemisphere versus left hemisphere; inter-observer versus intra-observer; and anatomical reference groups).
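The Kappa statistic used for these comparisons can be sketched as follows. This is the two-rater (Cohen) version applied to one pair of judges at a time, which is an assumption on my part, since the text does not say which Kappa variant was computed. The ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater1, rater2):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    # Agreement expected by chance from each rater's marginal frequencies
    expected = sum(c1[k] * c2[k] for k in c1) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical judgments: was the structure identified in each image?
judge_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
judge_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]
kappa = cohen_kappa(judge_a, judge_b)
print(round(kappa, 3))
```

Here the judges agree on 8 of 10 images, but about half of that agreement is expected by chance, so the coefficient is noticeably lower than 0.8.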

In this project, I evaluated the sociodemographic profile and clinical risk factors associated with schizophrenic (paranoid) individuals with a history of violent behavior.

To meet the objectives of the project, I: 1) analyzed the internal consistency of the HCR-20 and MOAS scales through Cronbach’s Alpha; 2) verified the effectiveness of the HCR-20 in predicting violent behavior (HCR-20 > 21) in hospitalized schizophrenic patients with a history of violence, through the AUC (area under the curve) of the ROC (receiver operating characteristic) curve; 3) identified the main forms of aggression perpetrated, through the Correlation Analysis between MOAS and HCR-20, as well as the association of MOAS with the classification of violent behavior obtained by the HCR-20; and 4) with the reliability of the HCR-20 and MOAS established and the classification of violent behavior (HCR-20 > 21) obtained, addressed the general objective (to identify the main factors related to schizophrenia and the occurrence of violent behavior) through bivariate analyses, essentially the Spearman Correlation and the Mann-Whitney Test, between the HCR-20 scales and the other variables collected from clinical data and current situation.
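The AUC in step 2 has a simple interpretation that can be coded directly: it is the probability that a randomly chosen patient with the outcome scores higher than a randomly chosen patient without it. The sketch below uses hypothetical scores and outcome labels; the names are illustrative and not taken from the project database.

```python
def roc_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical total scores and observed violent behavior (1 = yes)
hcr_scores = [25, 30, 18, 19, 15, 28, 20, 21]
violent = [1, 1, 0, 1, 0, 1, 0, 0]
auc = roc_auc(hcr_scores, violent)
print(auc)
```

An AUC of 0.5 means the score ranks patients no better than chance, and 1.0 means perfect separation.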

From an Access database I extracted 7,823 samples from 1,325 patients who sent peripheral blood or bone marrow samples for diagnosis, for the detection of the BCR-ABL fusion gene, or for the molecular evaluation of the response to the established treatment. The central cohort of this study came from the hematology service and the partner centers, and consisted of only 21 individuals.

As the study involved (i) a small sample; (ii) hypothesis tests between paired samples; (iii) hypothesis tests between independent samples; and (iv) bivariate relationships between scale and nominal variables, I opted for the following nonparametric tests: 1) Wilcoxon test; 2) Mann-Whitney test; and 3) Spearman correlation.

This project aimed to describe the sociodemographic, clinical and in-hospital evolution characteristics of children and adolescents up to 14 years of age, victims of non-intentional external injuries that required hospitalization, living in the city of Florianópolis, in 2013.

The analyses in this work followed the objectives pursued. Thus, I: 1) described the sociodemographic variables of the children and adolescents and of their caregivers; 2) described the characteristics of the injuries suffered; and 3) identified the in-hospital evolution of the children and adolescents who required hospitalization due to those injuries. Additionally, I ran a Bivariate Analysis (cross tabulations and hypothesis tests) between the variables of interest (outcome and independent), and later explained the outcome in relation to the independent variables through a Logistic Regression.

Psychology

I was involved in the validation of an instrument designed for studying and mapping the abilities of health managers.

Following the scale validation processes, I actively engaged in guiding the validation of the instrument’s content, semantics, and appearance, which was assessed by public health management professionals using a focus group approach. Once the instrument was defined, I helped guide its use in a pilot project with health managers recruited through the Lattes platform. After that, I gave advice on the instrument’s psychometric validation using exploratory and confirmatory factor analysis. The goal was to see if the instrument could measure exactly what it proposed, assuring its reliability and legitimacy for future research into mapping important abilities for Brazilian public health managers. My assistance was critical in ensuring the effective planning and validation of this tool.

This project built on a previous project in which I participated (“Does your personality explain your lie’s intention? A study on the personality, lie and mediation of the Planned Behavior Theory”). With the updated sample (n = 1,269), a more complex structural model was proposed.

As in the previous project, I used variance-based Structural Equation Modeling (PLS-SEM) rather than classical covariance-based SEM, mainly because the theoretical model included second-order formative constructs (a hierarchical component model, HCM). The procedures for fitting the proposed second-order formative-reflective model followed those recommended in the literature, with the repeated indicators approach applied twice to the higher-order constructs (HOC). I did not take the two-step approach (in which the lower-order construct (LOC) scores obtained in a first stage serve as HOC items in a second stage), because the model was not reflective-formative or formative-formative and the approach could therefore cause problems, since the endogenous variable of interest precedes another predictor of interest. Additionally, I used Mode B for the formative constructs (lie scale in the context of PBT) and Mode A for the personality scale (reflective construct), together with the factor weighting scheme to estimate the parameters of the PLS model.

The criteria adopted for the evaluation of the measurement and structural models were also those recommended in the literature: (i) data treatment for outliers and missing values; (ii) evaluation of the reflective measurement models (HOC) in terms of convergent validity (factor loadings > 0.7 and AVE > 0.50), discriminant validity (Fornell-Larcker criterion) and reliability (Cronbach’s alpha, rho_A and composite reliability); (iii) evaluation of the formative measurement models (LOC) in terms of multicollinearity (VIF < 3.3) and the relative contribution (significant outer weights) and absolute contribution (outer loadings > 0.5) of the items; (iv) evaluation of the structural model in terms of collinearity (VIF < 3.3), significance and relevance of the paths [via two-tailed bias-corrected and accelerated (BCa) bootstrap with 95% CI and n = 5,000], in-sample predictive power (R2), effect sizes (Cohen’s f2) and out-of-sample predictive power (via PLSPredict analysis using Q2 and RMSE/MAE measures); (v) evaluation of the robustness of the model through the analysis of nonlinear relationships (quadratic effects test); and (vi) multigroup analysis (MGA) to verify potential gender differences, following the steps and procedures recommended in the literature, such as the analysis of measurement invariance using the MICOM (Measurement Invariance of Composite Models) technique.
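The convergent validity and reliability thresholds in item (ii) are computed from the standardized outer loadings. A minimal sketch follows, using the usual textbook formulas for AVE and composite reliability and hypothetical loadings and correlations; SmartPLS reports these quantities directly, so this is only meant to show what lies behind them.

```python
import math

def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """Composite reliability from standardized loadings."""
    total = sum(loadings)
    error = sum(1 - l * l for l in loadings)
    return total ** 2 / (total ** 2 + error)

def fornell_larcker_ok(ave_a, ave_b, corr_ab):
    """Discriminant validity: sqrt(AVE) of each construct exceeds their correlation."""
    return math.sqrt(ave_a) > abs(corr_ab) and math.sqrt(ave_b) > abs(corr_ab)

# Hypothetical loadings for one reflective construct
loadings = [0.80, 0.75, 0.70]
construct_ave = ave(loadings)
construct_cr = composite_reliability(loadings)
print(round(construct_ave, 3), round(construct_cr, 3))
print(fornell_larcker_ok(construct_ave, 0.60, corr_ab=0.50))
```

With these loadings the AVE is just above the 0.50 threshold, which shows how quickly convergent validity erodes once loadings drop toward 0.7.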

As the conceptual model to be fitted also included reflective measurement models, I used the Consistent PLS (PLSc) algorithm in order to mimic classical SEM, as strongly recommended in the literature for this situation. To run the MGA I also used the consistent features in SmartPLS 3.3, such as Consistent Permutation and Consistent Multi-Group Analysis (MGAc).

In this project, I participated in the final phase, reviewing the use of quantitative techniques in the cross-cultural adaptation of three measures for the evaluation of (hypo)mania in children and adolescents: Child Mania Rating Scale – Parent Version (CMRS-P), Parent Young Mania Rating Scale (P-YMRS) and Parent General Behavior Inventory – 10 item Mania Scale (PGBI-10). Specifically, I worked on the review of the evidence of validity based on the internal structure (Exploratory Factor Analysis (EFA), Confirmatory Factor Analysis (CFA), internal consistency and reliability parameters) and of the evidence based on the relationship with external variables (convergent and concurrent validity). I also analyzed the reduced versions of the instruments, based on the original reduced forms. In addition to fitting the measurement models proposed by the researcher through EFA and CFA, I ran other analyses: Multigroup CFA and Item Response Theory (IRT).

In this project I examined the relationship between personality and the intention of prosocial and antisocial lying. Personality was evaluated under the Five Factor Model (FFM) using the BFI-20 scale. To explain the variables closest to the behavior, I drew on the Planned Behavior Theory (PBT), and a model of “Intention to Lie” was proposed for the purposes of the research. Both measurement models were validated with the Partial Least Squares Structural Equation Modeling (PLS-SEM) approach in a sample of 658 individuals, which also covered a structural model.

We chose the PLS-SEM technique rather than classical covariance-based SEM because of the complexity of the proposed conceptual model relative to the available sample size. I followed the procedures indicated in the literature for the use of PLS-SEM. The steps and criteria adopted for the evaluation of the proposed conceptual model were: (i) evaluation of the data for outliers and missing values; (ii) evaluation of the reflective measurement models in terms of convergent validity (factor loadings > 0.7 and AVE > 0.50), discriminant validity (HTMT and Fornell-Larcker criterion) and reliability (Cronbach’s alpha, rho_A and composite reliability); (iii) evaluation of the structural model in terms of collinearity (VIF < 5), significance and relevance of the paths (via two-tailed BCa bootstrap with n = 5,000), in-sample predictive power (R2), effect sizes (Cohen’s f2) and out-of-sample predictive power (via PLSPredict analysis using Q2 and RMSE/MAE measures); and (iv) evaluation of model robustness through the analysis of nonlinear relationships (quadratic effects test). SmartPLS 3.3 software was used for the analyses.
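The effect size in item (iii), Cohen’s f2, compares the R2 of an endogenous construct with and without a given predictor. The sketch below uses hypothetical R2 values; the size labels follow Cohen’s conventional cut-offs (0.02, 0.15, 0.35), which I am assuming were the ones applied.

```python
def cohen_f2(r2_included, r2_excluded):
    """Effect size of one predictor: change in R2 relative to unexplained variance."""
    return (r2_included - r2_excluded) / (1 - r2_included)

def f2_label(f2):
    """Conventional labels (Cohen): 0.02 small, 0.15 medium, 0.35 large."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"

# Hypothetical R2 of the endogenous construct with and without one predictor
f2 = cohen_f2(r2_included=0.40, r2_excluded=0.34)
print(round(f2, 3), f2_label(f2))
```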

In this project I did a comprehensive analysis of a database collected in Australia that was about the influences of planning for retirement. The predecessor variables studied were several other scales (latent variables), such as: financial literacy, financial control, financial attitude and behavior, financial anxiety, physical health and mental health. It was a longitudinal study with information collected in five moments.

In terms of analyses, several Structural Equation Models (SEM) were run, from the evaluation of measurement models (Confirmatory Factor Analysis) to the examination of structural relationships through mediation models with panel data, such as latent growth model and cross-lagged model.

In this project, I investigated the working memory and literacy of children and adolescents at risk for neurodevelopmental disorders. The research hypothesis was that literacy skills are related to working memory performance, and that age effects mediate these relationships. Working memory was measured with a subscale (IMO) of the Wechsler Intelligence Scale for Children (WISC), which assesses the intellectual capacity of children and adolescents in four domains, including working memory. Since only the raw scores were available, calculated and tabulated from the WISC manual, I assumed that it is a valid and reliable instrument in this sample. Additionally, I did not use the scores of the SNL subtest due to the large amount of missing data.

In the case of literacy skills, the project proposed its own protocol. I therefore used an Item Response Theory (IRT) model suited to the problem to seek evidence of validity for this protocol, and later a Multiple Linear Regression (MLR) model to test the research hypotheses. Within IRT, I used the Partial Credit Model (PCM), which is suited to ordinal data in which higher score categories in cognitive tasks indicate greater ability.

The objective of this project was to evaluate which personal characteristics have an impact on the expectation of duration of romantic relationships in Brazilian studies. For this, I had a sample of individuals who answered a research questionnaire containing the following instruments: i) Sociosexual Orientation Inventory (SOIR) of Penke and Asendorpf; ii) Rosenberg Self-Esteem Scale (AA); iii) Clinical Inventory of Self-Concept (AUT) developed by Vaz Serra; iv) Time Perception Scale of Romantic Relationships (PTRA), based on Buss and Schmitt and proposed in this research; and v) a sociodemographic questionnaire with information on age, gender, sexual orientation (heterosexual and non-heterosexual), marital status and social class.

In terms of statistical procedures, I started with a descriptive analysis. Then, since I assumed that validated scales were being used (except the PTRA, which was conceived for the present purposes without the necessary methodological rigor), I proposed at least a Reliability Analysis of the scales through Cronbach’s Alpha. The main focus of the study was the PTRA score and, taking it as the outcome, I fulfilled the secondary objectives through the t-test, ANOVA and Spearman Correlation. Finally, to close the analyses and verify the joint influence of the independent variables on the time perception of romantic relationships, I ran a Multiple Linear Regression (MLR) of the PTRA on the independent variables selected by the forward stepwise method.

The aim of this report was to construct two multivariate models that explain frailty/vulnerability in the elderly. The sample consisted of about 550 individuals from Foz do Iguaçu, and its design and size followed the appropriate methodological procedures.

As the dependent variables were ordinal, with three levels each, we chose to develop two ordinal logistic models (ordered logit) to obtain: i) the odds ratio (OR) for each of the independent variables significant at the 10% level; and ii) the respective marginal effects for the typical case, for better interpretation of the results. The two final models were evaluated in terms of robustness and adequacy: i) parallel lines test, to verify whether a single coefficient (OR) can be used for the three categories; ii) heteroscedasticity test, to verify the need for robust estimates; iii) link test, to examine potential specification problems such as omitted variables; iv) comparison of information criteria (AIC and BIC) between competing models, such as multinomial models; and v) the Hosmer-Lemeshow (HL) goodness-of-fit test. There were several candidate variables: some were significant in the bivariate tests, and others entered the multivariate analyses because they were suspected of affecting the outcome. We therefore used variable selection criteria to arrive at more parsimonious models: forward stepwise and backward stepwise.
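The link between a single coefficient, its odds ratio and the category probabilities in an ordered logit can be sketched as below. The coefficient and cut points are hypothetical, and the parameterization P(Y ≤ j) = logistic(cut_j − xb) is one common convention that I am assuming here; software packages differ in sign conventions.

```python
import math

def logistic(z):
    return 1 / (1 + math.exp(-z))

def ordered_logit_probs(xb, cut1, cut2):
    """Probabilities of the three ordered categories under proportional odds."""
    p_low = logistic(cut1 - xb)
    p_mid = logistic(cut2 - xb) - p_low
    p_high = 1 - logistic(cut2 - xb)
    return p_low, p_mid, p_high

# Hypothetical coefficient of one binary predictor and two cut points
beta = 0.5
odds_ratio = math.exp(beta)

typical = ordered_logit_probs(xb=0.0, cut1=-1.0, cut2=1.0)    # predictor = 0
shifted = ordered_logit_probs(xb=beta, cut1=-1.0, cut2=1.0)   # predictor = 1

# Discrete change in each category probability for the typical case
effects = [s - t for s, t in zip(shifted, typical)]
print(round(odds_ratio, 3), [round(e, 3) for e in effects])
```

The same odds ratio applies at both cut points, which is exactly the assumption the parallel lines test checks.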

Education

This statistical report sought to examine the understanding and practice of entrepreneurship, social entrepreneurship, applied entrepreneurship in basic education, and methodology for the development of projects by basic education professionals in Minas Gerais’ extreme south. To do this, it surveyed 71 teachers, specialists, principals, and vice-principals from the state and local basic education networks in five cities in Minas Gerais’ south.

Data analysis included a descriptive analysis of the profile factors and of the questionnaire items on entrepreneurial knowledge and practice, as well as project development. The reliability of the anticipated components (modules) was then assessed with McDonald’s omega, allowing the scales’ scores (mean of the item scores) to be used in future studies. The scores of the scales were then compared to one another and across the profile factors.

The following non-parametric tests were used for this descriptive comparison: Spearman’s correlation (ρ), Mann-Whitney test (MW), Kruskal-Wallis test (KW), and Friedman’s test. Non-parametric tests were chosen since the scores are, by definition, ordinal variables. The association between the scores was assessed using Spearman’s correlation, the MW test when the categorical variable had two categories, and the KW test when the categorical variable had three or more categories. When the samples were independent, we utilized the MW and KW tests; when we had dependent (related) samples, we used the Friedman test instead of the KW.

Furthermore, because the last two tests evaluate a joint hypothesis, their results can indicate that at least one difference exists without pointing to where it lies. Post-hoc tests were therefore required: the Dwass-Steel-Critchlow-Fligner (DSCF) test for multiple comparisons after the KW test, and the Durbin-Conover test for multiple comparisons after the Friedman test. The analyses were carried out using JASP and Jamovi software.

This was one of the projects in which I went deepest into techniques for the quantitative analysis of qualitative data (unstructured research), specifically Natural Language Processing (NLP) and Machine Learning applied to the interpretation of texts (speeches). The objective was to examine the meaning of Amazon nut extractivism for farmers in southern Roraima, integrating their experiences in the activity, cultural aspects and local ecological knowledge about the environmental conditions that may be related to the conservation of the species. The intention was to relate experience in the activity, cultural aspects and local ecological knowledge to the conservation of the species, based on 13 unstructured interviews. These were recorded, transcribed, and then adapted to form the textual corpus for the qualitative analyses that followed.

Data analyses were performed using the software Interface de R pour les Analyses Multidimensionnelles de Textes et de Questionnaires (IRaMuTeQ). The software’s main objective is to analyze the structure and organization of discourse, making it possible to describe the relationships between the lexical worlds most frequently expressed by the research participants. I used: Word Cloud, to group the words and organize them graphically according to their relevance; Similitude Analysis, which identifies the co-occurrences between words and indicates how they are connected, facilitating the understanding of the textual corpus; Analysis of Specificities, to check for differences in evocations (considering word frequencies and their hypergeometric indexes/χ2) among participants with different lengths of residence in the locality; and Descending Hierarchical Classification (DHC), to obtain the dendrogram with the emerging classes, where the higher the χ², the more strongly the word is associated with the class, disregarding words with χ² < 3.84 (that is, keeping only associations significant at p < 0.05).
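The 3.84 cut-off is the critical value of a chi-square with one degree of freedom at the 5% level. As a rough illustration, the association between a word and a class can be written as a 2x2 table of text segments. This sketch assumes a plain Pearson chi-square without continuity correction and uses made-up counts; IRaMuTeQ’s internal computation may differ in detail.

```python
CRITICAL_CHI2_1DF_5PCT = 3.84  # threshold quoted in the text

def chi2_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table (no continuity correction).

    a = segments in the class containing the word
    b = segments outside the class containing the word
    c = segments in the class without the word
    d = segments outside the class without the word
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical counts for one word and one class
chi2 = chi2_2x2(a=30, b=10, c=70, d=190)
keep_word = chi2 >= CRITICAL_CHI2_1DF_5PCT
print(round(chi2, 2), keep_word)
```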

This project was one of the longest I participated in, involving data collection over three years, and it gave rise to two comprehensive studies: i) to investigate the relationship between general physical fitness, body image and strength training in students from the 6th to the 9th grade of elementary school; and ii) to create a coefficient of physical fitness based on the Projeto Esporte Brasil (PROESP), to measure body cathexis in relation to gender, age, and grade throughout the study, and to correlate the fitness coefficient, body cathexis, life habits, school performance and attendance. I participated in the project from the beginning, helping with the design of the proposal and the preparation of the questionnaire, and later doing extensive work on cleaning, organizing and structuring the database.

The design of the quasi-experiment was as follows: 1) information collected at six different moments (T1, T2, T3, T4, T5 and T6); and 2) five large blocks of variables, with several missing values: a) profile/identification variables; b) physical fitness, based on the manual of tests and evaluation of the Projeto Esporte Brasil 2016; c) life habits, from a questionnaire developed for the purposes of the research; d) body image, through the Secord and Jourard Body Cathexis Scale (1954); and e) school performance, through the grades and attendance of students in physical education and in all other subjects (general).

Due to the design of the experiment and the need to make it workable for the research objectives, I adopted several strategies after describing the data (sample profile): 1) I assessed the reliability of the measurements; 2) as there was, strictly speaking, no temporal precedence in the quasi-experimental design to support causal inference, I tested whether there were significant differences between two distinct moments through the McNemar and Wilcoxon tests, and between groups through the Mann-Whitney Test, Fisher’s Exact Test and Spearman’s Correlation; and 3) I ran an Exploratory Factor Analysis to reduce dimensionality and create a fitness coefficient, and developed a Generalized Estimating Equation (GEE) model to address the research objectives.
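For the paired before-and-after comparisons of a binary variable, the McNemar test only looks at the students who changed category between the two moments. The sketch below implements the exact (binomial) version with hypothetical counts; whether the exact or the chi-square version was used in the project is not stated, so this is an assumption.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b = pairs that changed from 'no' to 'yes' between the two moments
    c = pairs that changed from 'yes' to 'no'
    Concordant pairs carry no information and are ignored.
    """
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical discordant counts between T1 and T2
p_value = mcnemar_exact(b=2, c=10)
print(round(p_value, 4))
```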

In this project I validated several scales on the relationship of parents with their children’s school in terms of child discipline, family impact, gratitude, school impact, employee impact and respect. This was a cross-sectional study with information from about 1,000 students from 4 secondary schools in São Paulo.

Since the psychometric properties of the scales were already known, I proceeded directly to Confirmatory Factor Analysis (CFA) and later related the scores using nonparametric tests. The sponsor did not have a conceptual model that would support structural models.

The objective of this project was to analyze the effectiveness of the Problem-Solving Strategy (ERP) as a teaching methodology based on Ausubel’s Theory of Meaningful Learning and Talízina’s teaching direction. For this, about 80 students of the 1st grade of high school at a military school in the state of Roraima, distributed in four afternoon-shift classes and aged between 15 and 16 years, were taught with this methodology (experimental group) or with the traditional methodology (control group) to develop the contents of function, affine function and quadratic function.

Initially, the reliability of the measurement instrument was analyzed. As the diagnostic test was applied before (_PRE) and after (_POS) the content was taught, it was possible to calculate the test-retest reliability for each of the contents and, from the set of items (problems in each dimension) of the tests, to compute the internal consistency (Cronbach’s Alpha) of the instrument in each of the contents. In a second stage, I ran a Descriptive Analysis and finally, in inferential terms, fitted Multiple Regressions with the pre-test scores as control covariates.

This project set out to evaluate a causal model relating individual factors and student performance, their attitude towards THE and this modality in general, in the context of higher education. The instrument was applied in a federal education center and made it possible to generate analyses that could contribute to the discussion of the use and role of information technology in higher education.

In methodological terms, as this was a primary data collection through a validated instrument, after evaluating the missing values and outliers I examined the reliability of the scales (Cronbach’s Alpha) and ran several linear regressions to test the hypotheses. It was clearly a problem for Structural Equation Modeling (SEM); however, the sponsor asked not to move in this direction because of their own limited familiarity with the technique.

In this project, I investigated the perception that high school students have of fungi. The research instrument was a questionnaire in which students rated statements about their knowledge of fungi from 1 to 4 on a Likert scale (1 = totally disagree, …, 4 = totally agree), or left the item blank if they did not know the statement.

I used the Classical Test Theory (CTT) approach to analyze the items (selection of the items to be kept in the test). In practical terms, each item was analyzed according to its properties in skill tests: i) difficulty (proportion of correct answers, or p); ii) discrimination (D-index and point-biserial correlations); and iii) reliability based on internal consistency between items (Cronbach’s Alpha). As I adopted an internal criterion to validate the items, using the total score for criterion validity, I also ran a Principal Component Analysis (PCA) to verify whether the test reflects a single trait (unidimensional structure): knowledge of the fungi theme. With the items selected to make up the test, I computed the total score for each individual and examined whether the score differed by school, shift and sex. For the last two, I used the Mann-Whitney test and, for the first, because it is a variable with more than two categories, the Kruskal-Wallis test with subsequent multiple comparisons.
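The two item statistics at the core of this analysis, difficulty and point-biserial discrimination, can be sketched in a few lines. The example assumes items already scored as 0/1 (incorrect/correct) and uses hypothetical answers; the total score here includes the item itself, which is the simpler, uncorrected version of the index.

```python
from statistics import mean, pstdev

def item_difficulty(item):
    """Proportion of correct answers (p) for an item scored 0/1."""
    return mean(item)

def point_biserial(item, totals):
    """Point-biserial correlation between a 0/1 item and the total score."""
    p = mean(item)
    mean_correct = mean(t for x, t in zip(item, totals) if x == 1)
    mean_wrong = mean(t for x, t in zip(item, totals) if x == 0)
    return (mean_correct - mean_wrong) / pstdev(totals) * (p * (1 - p)) ** 0.5

# Hypothetical answers of eight students to one item, and their total scores
item = [1, 1, 0, 1, 0, 0, 1, 1]
totals = [9, 8, 3, 7, 4, 2, 6, 9]
difficulty = item_difficulty(item)
discrimination = point_biserial(item, totals)
print(difficulty, round(discrimination, 3))
```

An item answered correctly mostly by high scorers gets a discrimination close to 1, while an item unrelated to the total score gets a value near 0 and is a candidate for removal.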

The statistical analysis carried out in this project had the general objective of evaluating the attitudes of teachers at the public school system of the Federal District in relation to inclusion, as well as analyzing the beliefs of self-efficacy that are developed by this same sample.

In methodological terms, I took the following steps: i) I started from the hypothesis that the two scales of the research were not widely used in Brazilian contexts and therefore lacked an evaluation of their psychometric properties, at least through Exploratory Factor Analysis (EFA); ii) once the composition of the factors derived in the previous step was decided, I tested the scores of the scales against socio-professional variables via nonparametric tests (Spearman correlation, Mann-Whitney test and Kruskal-Wallis test).

This project aimed to evaluate how well students of the oil and gas operation technician training course understand scientific and technical contents related to fluid mechanics. This knowledge is considered of outstanding importance for the practice of these professionals in the technological and productive processes of the oil and gas industry, where theoretical and practical knowledge come together in solving concrete problems.

The available sample was small and there was no a priori conceptual model. I therefore chose to evaluate the intervention and the evolution of the students’ knowledge through the score of each test. I applied a nonparametric analysis of variance for related samples [Friedman test (χ2)] to evaluate performance at the three distinct moments, and Kendall’s coefficient of concordance (W) to assess the reliability of the tests. When performance was compared at only two moments (Pre-course and Retention), I used the Wilcoxon Test (Z) for two related samples, with the Spearman correlation (ρ) as the reliability measure for those two moments. In addition, to know whether overall performance (at the three moments or at two moments) was related to the student’s age and prior experience, I applied Spearman’s Correlation and the Mann-Whitney Test (U).
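The Friedman statistic and Kendall’s W are two views of the same within-student ranking, linked by χ2 = n(k − 1)W. The following sketch shows this with hypothetical scores at three moments; it omits the correction for ties that statistical packages apply, so results can differ slightly when tied scores are present.

```python
def rank_row(row):
    """Ranks within one student's scores; tied values share the mean rank."""
    ordered = sorted(row)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in row]

def friedman_and_kendall_w(table):
    """Friedman chi-square and Kendall's W (no tie correction).

    table: one row per student, one column per moment.
    """
    n, k = len(table), len(table[0])
    ranks = [rank_row(row) for row in table]
    col_sums = [sum(r[j] for r in ranks) for j in range(k)]
    chi2 = 12 / (n * k * (k + 1)) * sum(s * s for s in col_sums) - 3 * n * (k + 1)
    w = chi2 / (n * (k - 1))
    return chi2, w

# Hypothetical test scores of five students at three moments
scores = [
    [4, 7, 6],
    [5, 8, 7],
    [3, 6, 5],
    [6, 9, 7],
    [5, 6, 8],
]
chi2, w = friedman_and_kendall_w(scores)
print(round(chi2, 2), round(w, 2))
```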

The general objective of this project was to construct an evaluative instrument for investigating teachers' conceptions of the purposes of Cooperative Games in School Physical Education. Specifically, we sought to evaluate the evaluators' opinions, carry out a test-retest in the pilot sample, and evaluate the internal consistency and accuracy of the cooperative games questionnaire in the final sample.

In this case, in addition to proposing the judges' questionnaire, I used the Kappa coefficient to evaluate the agreement between the evaluators, and assessed test-retest agreement using: i) among the items: Somers' D; and ii) between the scores: Spearman correlation; in order to assess the accuracy of the questionnaire. To evaluate the internal consistency, that is, the homogeneity of the sample of test items, I used Cronbach's Alpha.
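As an illustration of two of the coefficients mentioned above, the following sketch computes Cohen's Kappa for two raters and Cronbach's Alpha for a set of items. The function names and toy data are hypothetical, and the Kappa shown is the simple two-rater, unweighted version, which may differ from the variant used in the project.

```python
from statistics import variance


def cohen_kappa(rater1, rater2):
    """Unweighted Cohen's Kappa for two raters classifying the same items."""
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    categories = set(rater1) | set(rater2)
    # Agreement expected by chance from each rater's marginal proportions
    expected = sum((rater1.count(c) / n) * (rater2.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected)


def cronbach_alpha(responses):
    """responses: one row per respondent, one column per questionnaire item."""
    k = len(responses[0])
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical judgments (1 = item adequate, 0 = not adequate)
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))

# Hypothetical Likert answers from four respondents to three items
print(cronbach_alpha([[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2]]))
```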

This project required relatively simple statistical analyses. In it I evaluated the influence of karate and taekwondo practice on the development of people with intellectual disabilities, using before-and-after information on some psychological aspects. Since I did not know the origin of the scales used to measure these psychological aspects, and only the final scores of the measurements were available, I assumed full reliability of the measurement models.

The problem dealt with two related (paired) samples, with the same individual being observed before and after treatment (the subject being their own control). For the crossings of nominal variables I applied the Marginal Homogeneity Test; for the crossings between ordinal variables, I used the Sign Test and the Wilcoxon Test.
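The paired before/after logic can be sketched with the exact two-sided Sign Test, which only needs the binomial distribution. This is a minimal illustration with a hypothetical function name and made-up scores, not the project's data.

```python
from math import comb


def sign_test(before, after):
    """Exact two-sided Sign Test for paired observations.

    Pairs with no change are dropped; returns (positives, n_used, p_value).
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    # Probability of a split at least this extreme under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, n, min(1.0, 2 * tail)


# Hypothetical ordinal scores for five participants before and after practice
print(sign_test([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
```

The Wilcoxon Test goes one step further by also using the size of each difference (through ranks), which is why the two are usually reported side by side.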

In this project I analyzed the learning profile of doctors, nurses and dentists regarding the use of the Internet in daily professional life. A questionnaire built for this purpose was applied to approximately 300 health professionals (physicians, nurses and dentists) in order to describe and relate the variables collected. I worked on the project from the design of the questionnaire onward, assisting in data collection through a proposed online collection form and analyzing the pre-test (pilot) of the questionnaire.

The analyses in this work followed the objectives pursued. Thus, to fulfill the objective of i) understanding the profile of health professionals and identifying their formal and informal learning process on the web, I performed a Descriptive Analysis (frequency tables and descriptive statistics) of the collected variables; and to ii) map similarities and differences between professional segments with regard to the learning profile, I applied bivariate nonparametric tests, especially Chi-square, Kruskal-Wallis and Spearman's correlation.

In this project I reviewed the calculations of a Multivariate Analysis that was initially computed on a sample of 70 individuals, later reduced to a sample of 40 individuals. As the analyses were incorrect, I ran others suited to the characteristics of the data.

The procedure adopted for the problem was a Multiple Linear Regression with the stepwise method, widely used when the goal is to maximize the explanation of the dependent variable and there is no certainty about the theoretical model a priori. Additionally, because the sample was small, I recommended a Univariate Analysis, mainly because there were several categorical variables that had to be restructured before some bivariate tests could be run. Finally, for another variable of the research, which was not the main one, I ran some simplified ANOVA models and logistic regressions.

In this project I worked from the data collection (Saeb microdata) onward, passing through the cleaning, structuring and organization of the database, and then a stratified sampling procedure. The statistical analysis performed in this project aimed primarily to describe and analyze the intensity and trends of inequality of educational opportunities in Brazil. In specific terms, I investigated the relationships between social origins and the proficiency of students in the 5th and 9th grades of elementary school and in the 3rd year of high school, considering student performance in Mathematics and Portuguese Language in 2013.

Due to the high number of questions in the Saeb, the sponsor first made a selection of variables justified by the literature and told me which ones were considered important for the purpose of the study. Analyzing these questions, I verified that, for the chosen statistical techniques to be applied, the vast majority of the questions could be grouped into 0/1 categorical variables, or had to be transformed into binary (dummy) or ordinal variables (1, 2, 3 etc.). After this new categorization of the variables, a Principal Component Analysis was performed with the sole objective of reducing the variables, so that the components could be used as substitutes for the observed variables and reduce the number of study items in subsequent analyses. As the aim of the study was to analyze the influence of the social, demographic and environmental context of the student on school performance, essentially proficiency in Mathematics and Portuguese Language, the technique employed was Multiple Linear Regression with a stepwise method of variable selection.

This project was one of the coolest and most enjoyable to carry out. This was a study with two groups of users, 10 blind and 10 sighted, in which usability tests were performed on six websites: two sites per category (e-commerce, education and entertainment), each category having a responsive website and a non-responsive website. Six tasks were applied on each website, and for each task I extracted the completion time or the time it took the user to give up, if that occurred. The interaction with the website was recorded during the execution of the tasks, and the degree of disorientation (lostness) of the individual was also extracted. At the end of each access to a website, the user answered a questionnaire with objective questions, mainly with Likert scales from one to five, in order to better evaluate each of the websites with regard to ease of use, demand for effort, organization of information, etc. At the end of the research, a questionnaire was applied to profile these individuals (gender, schooling, age, contact with computers and time of use per week).

The data were arranged so that the sites (one to six) were the units of analysis. Since I knew who did or did not complete each task successfully, but not how long a user who gave up would have needed to complete it, I was facing a clear case of censored data, and so I proposed a Survival Analysis (SA). I estimated the survival function and/or failure rate of the event, compared this function and/or failure rate between distinct groups (Kaplan-Meier procedure), and evaluated the relationship with explanatory variables (Cox regression). In addition to these techniques, I used other nonparametric tests, such as the Mann-Whitney (U) test and Spearman's Correlation (ρ), to evaluate some of the hypotheses of the research. I also used Factor Analysis to obtain estimates of the factors, so that they could be used as substitutes for the observed variables and reduce the number of study items in subsequent analyses.
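To show how censoring enters the calculation, here is a minimal Kaplan-Meier sketch in plain Python. In this setting the event is completing the task, and a user who gives up is censored at the time of giving up. The function name and the toy times are hypothetical, not the study's data.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the probability that a task is still unfinished.

    times:  seconds until completion or until the user gave up
    events: 1 if the task was completed (event), 0 if the user gave up (censored)
    Returns a list of (time, survival) pairs, one per distinct completion time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        completed = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)  # completed + censored at t
        if completed > 0:
            survival *= 1 - completed / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # censored users leave the risk set without an event
        i += leaving
    return curve


# Hypothetical task times; the second user at t=3 gave up (censored)
for t, s in kaplan_meier([2, 3, 3, 5], [1, 1, 0, 1]):
    print(t, round(s, 3))
```

The key point is that a user who gave up still counts in the denominator up to the moment of giving up, so the information is used rather than thrown away or treated as a completion.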

This project aimed to investigate affectivity in the environment of physical education classes in the municipal school network of Curitiba. In specific terms, the sponsor also intended to: 1) identify the personal characteristics of the students; 2) identify the emotions and feelings of students in the environment of physical education classes; 3) verify the students' esteem regarding the environment of physical education classes; and 4) correlate physical activity practice with positive/negative esteem.

In terms of analyses to address the situation, I surveyed the feelings (words) that were repeated most often (frequency) to support the qualitative part. Subsequently, I described the personal data of the physical education students (frequency and descriptive statistics). For one of the questions, I performed a Factor Analysis to explore the factors present in the scale of emotions and feelings in the environment of physical education classes. With the factors obtained in the previous analysis, I presented cross-tables and their respective hypothesis tests (Mann-Whitney Test and Kruskal-Wallis Test) against some personal data variables (school, gender and age) and physical activity outside the physical education environment (type of physical activity, amount of weekly practice and duration of practice).

In this project I investigated the emotions and feelings of athletes and coaches during basketball training in the school environment. The complete research involved a semi-structured questionnaire with two main sections: 1) sociodemographic questions, to identify the sociodemographic profile of athletes and coaches; and 2) an affective map, whose questions analyzed in this report involved a Likert-type scale and the frequencies of feelings.

At first, all questions related to the quantitative part of the research were described (frequency and descriptive statistics) for each of the analysis groups: athletes and coaches. Later, I did a reliability analysis (Cronbach’s Alpha) of the scale of emotions and feelings in relation to the sport. In a third moment, the factors that passed the reliability test of the previous stage were crossed with some of the sociodemographic variables and information about the sport described in the first stage. In this third stage, the cross-table results of the nonparametric tests (Mann-Whitney Test, Kruskal-Wallis Test and Spearman Correlation) of the factors resulting from the previous stage against the variables of the first section were discussed.

Epidemiology

This project aimed to examine whether there is any space-time pattern in the incidence of leishmaniasis (LTA) in the state of Maranhão between 2007 and 2020. The data combined a cross-section of municipalities with a time series of annual periodicity, with no missing data, containing the LTA cases and the population of the municipalities of Maranhão between 2007 and 2020, in addition to the geolocation (latitude/longitude) of the municipalities.

The researcher asked for Spatial Scan Statistics, which perform a spatial or space-time inspection of diseases to evaluate the existence of statistically significant clusters. This approach, based on Machine Learning techniques, or more specifically on the Geographical Analysis Machine, tests whether a disease is randomly distributed in space, in time, or in space and time, and assesses the statistical significance of disease cluster alarms. A scanning window (circle, ellipse, square, or any other shape) is moved over a georeferenced map (centroids of municipalities, for example), and the number of events within the window is counted to decide whether it is higher than expected. The window is usually allowed to grow until it covers a given share of the population (50%, for example), significance is assessed by simulation (10,000 iterations, for example), and the most significant clusters are reported based on a statistic such as the log likelihood ratio (LLR), which compares the events inside the evaluated cluster with the events outside it. More specifically, I used a model based on the discrete Poisson distribution, where the number of events in a geographic area is Poisson-distributed according to a known underlying population at risk. I used the SaTScan v10.1 software to search for the space-time clusters, following the settings most recommended in the literature, and to build the maps and illustrations of the clusters.
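For intuition about the statistic behind the scan, the sketch below computes the log likelihood ratio of a single candidate window under the discrete Poisson model, in its usual high-rate form. It is only an illustration with a hypothetical function name and made-up counts; it does not reproduce SaTScan's window search or its Monte Carlo inference.

```python
from math import log


def poisson_llr(cases_in, pop_in, cases_total, pop_total):
    """Log likelihood ratio of one candidate cluster under the discrete Poisson model.

    Only windows with more cases than expected count as high-rate clusters;
    otherwise the statistic is set to zero.
    """
    expected = cases_total * pop_in / pop_total  # cases expected under no clustering
    if cases_in <= expected:
        return 0.0
    llr = cases_in * log(cases_in / expected)
    outside = cases_total - cases_in
    if outside > 0:
        llr += outside * log(outside / (cases_total - expected))
    return llr


# Hypothetical window: 20 of 40 cases fall in an area holding a quarter of the population
print(poisson_llr(cases_in=20, pop_in=1000, cases_total=40, pop_total=4000))
```

In a full analysis this value would be computed for every candidate window, and the maximum would be compared with the maxima obtained from data sets simulated under the null hypothesis.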

In this project I used time series techniques, one of the statistical approaches I most enjoy working with, and I obtained an excellent model fit. The objective was to examine the series of the incidence rate of leishmaniasis in the state of Maranhão between 2007 and 2020. It was a single time series with monthly periodicity, without missing data, measuring cases of leishmaniasis per 100,000 inhabitants.

To this end, I estimated SARIMA (Seasonal Autoregressive Integrated Moving-Average) models, following the classic Box-Jenkins methodology. Overall the work was very successful, because i) the series proved well-behaved in statistical terms (without outliers, approximately normal, etc.) and ii) the chosen model fit the data well. The results were presented along the four steps of the Box-Jenkins methodology: Identification, Estimation, Diagnosis and Forecasting; seeking to lay out the arguments in the most step-by-step, informal and visual way possible.
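A small sketch of what the Identification step does before any SARIMA model is estimated: regular and seasonal differencing remove trend and seasonality so the remaining series can be treated as stationary. The helper name, the seasonal period of 4 (a monthly series would use 12) and the toy series are hypothetical and only illustrate the "I" in SARIMA.

```python
def difference(series, lag=1):
    """Apply the operator (1 - B^lag): subtract the value observed `lag` periods earlier."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical series: linear trend plus a repeating pattern of period 4
seasonal_pattern = [5, 0, -3, 2]
series = [2 * t + seasonal_pattern[t % 4] for t in range(16)]

seasonally_differenced = difference(series, lag=4)      # removes the seasonal pattern
fully_differenced = difference(seasonally_differenced)  # removes what is left of the trend

print(seasonally_differenced)
print(fully_differenced)
```

Autocorrelation plots of the differenced series are then what guide the choice of the autoregressive and moving-average orders.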

In this project I helped identify the pattern of spatial and temporal distribution of mortality rates due to ischemic heart diseases in the State of Tocantins between 2008 and 2017. This was an ecological study, with a time series between 2008 and 2017, whose areal units of analysis were the 139 municipalities of the State of Tocantins grouped into eight health regions. The data were obtained from the Mortality Information System of the Informatics Department of the Unified Health System and from the Brazilian Institute of Geography and Statistics. The analysis allowed us to study the spatial distribution of mortality, testing the hypothesis of spatial dependence, based on an Exploratory Spatial Data Analysis (local and global Moran index).
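The global Moran index used to test spatial dependence can be written in a few lines. The sketch below assumes a hypothetical binary contiguity matrix and made-up rates for four regions lined up in a row; it does not include the permutation test normally used for significance.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of area values and a spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * (cross / sum(d * d for d in dev))


# Hypothetical mortality rates for four regions in a row; neighbors share a border
rates = [1.0, 2.0, 3.0, 4.0]
contiguity = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i(rates, contiguity))
```

Positive values indicate that neighboring areas tend to have similar rates, which is the kind of spatial dependence the exploratory analysis looks for.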

The general objective of this project was to evaluate the relationship between the incidence of leptospirosis and rainfall in Brazil between 2005 and 2014. For this I had two databases: 1) annual distribution of rainfall and leptospirosis in the period 2005 to 2014 for Brazilian capitals, except Porto Velho; and 2) pooled data from Brazilian capitals (except Porto Velho) for the period 2005-2014 with rainfall average information in the period, total cases of leptospirosis in the period and other control variables collected from the 2010 census.

In terms of modeling, two characteristics of the data guided my choice of techniques: 1) first, these were person-time count data, i.e., the response variable is the incidence rate of leptospirosis, for which Poisson-family distributions are by definition recommended when fitting models. I used Generalized Mixed Models (GMM) with four distribution-link combinations (Poisson-log, Normal-identity, Gamma-log and Negative Binomial-log), because the data showed a little of each of these characteristics; and 2) second, I had pooled (stacked) data for virtually all Brazilian capitals: i) the first data set had 26 independent observations of variables taken as means (rainfall), as sums over the period (incidence of leptospirosis) or as a single observation in 2010 (control variables taken from the census); and ii) the second data set had year-by-year rainfall and leptospirosis incidence (2005-2014) for the 26 capitals, i.e. 26×10=260 observations in a typical panel or repeated-measures data set. Accordingly, under the hypothesis that the object of study is the capitals themselves, since I had practically all of them, I used Fixed Effects; if instead the object of study is large cities in Brazil, with the capitals as an available sample of them, I considered running the GMM models with Random Effects.

In this project I evaluated the association between socio-economic and demographic indicators and the incidence of dengue in 2011/2012 in the municipality of Rio de Janeiro. I analyzed the laboratory-confirmed dengue cases reported in the 160 neighborhoods of the municipality of Rio de Janeiro in 2011 and 2012, extracted from the state database of the Notifiable Diseases Information System (SINAN), together with socio-economic and demographic data from the 2010 Demographic Census, obtained through IBGE.

After describing the research variables (descriptive statistics and correlations), I used Multiple Linear Regression with a stepwise method to attack the problem, due to the lack of a theory underpinning the study, the small sample and the high correlation between the variables.

In this project I evaluated the lethality of dengue vis-à-vis the control interventions implemented during the fight against the 2007-2008 epidemic in the Metropolitan Region of Rio de Janeiro. This was an ecological time series study based on secondary data on weekly cases and hospitalizations for dengue between January 2001 and December 2011. The data were extracted from the databases of the Notifiable Diseases Information System (SINAN), the Mortality Information System (SIM) and DATASUS.

I had the weekly time series, from January 2001 to December 2011, of the number of dengue cases and the number of hospitalizations for dengue in the metropolitan region of Rio de Janeiro. From this information I created the lethality variable. From there, I followed this methodological approach: i) fitting a univariate time series model with conditional heteroscedasticity (GARCH) to the lethality variable, in order to identify the stochastic process that generates the series; and ii) including two dummy variables in the model to identify/test the presence of structural breaks in the generating process, due to 1) the weeks considered epidemic and 2) the 10 weeks of control interventions implemented during the fight against the 2007-2008 epidemic. Using dummies to capture a stylized fact known to occur in a certain period of time falls within what is called Intervention Analysis in Time Series.

In this project I evaluated the severity of dengue during the control interventions implemented in the fight against the 2007-2008 epidemic in the Metropolitan Region of Rio de Janeiro. This was an ecological study of time series based on secondary data on cases and weekly hospitalizations for dengue between January 2003 and December 2011. The data were extracted from the SINAN, DATASUS and INMET databases.

Using the severity variables, I fitted two autoregressive distributed lag (ARDL) models. In the first case, instead of fitting a univariate time series model following the Box-Jenkins methodology, I fitted a multivariate time series model that explains the severity variable with dummies for the intervention periods. For the second model, the dummy variables were excluded from the analysis because we chose to test for structural breaks (or structural stability of the parameters): the Chow test, when the break date is known, and the Quandt-Andrews, Bai-Perron and CUSUM (recursive least squares) tests, when the break date is not known, in order to make inferences about the stability of the parameters of the fitted model.
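The idea behind the Chow test with a known break date can be sketched as follows. This is a deliberately simplified illustration on a simple regression with one explanatory variable, not the ARDL model of the project; the function names and toy data are hypothetical.

```python
def ols_rss(x, y):
    """Residual sum of squares of the simple regression y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def chow_f(x, y, break_index, k=2):
    """Chow F statistic for a break at a known position (k = parameters per regime)."""
    rss_pooled = ols_rss(x, y)
    rss_split = (ols_rss(x[:break_index], y[:break_index])
                 + ols_rss(x[break_index:], y[break_index:]))
    n = len(x)
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))


# Hypothetical weekly series: same line throughout vs. a level shift from week 5 on
weeks = list(range(10))
noise = [0.1 * (-1) ** t for t in weeks]
stable = [t + e for t, e in zip(weeks, noise)]
shifted = [t + e + (5 if t >= 5 else 0) for t, e in zip(weeks, noise)]

print(chow_f(weeks, stable, break_index=5))
print(chow_f(weeks, shifted, break_index=5))
```

The statistic is compared with an F distribution with k and n − 2k degrees of freedom; a large value means that fitting the two periods separately explains the data much better than a single set of parameters.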

Biology

This project was somewhat simple, given the sample size (n = 18) and the use of a semi-structured questionnaire. The main objective was to examine what it means to be an Amazon nut extractivist for farmers in southern Roraima, integrating their experience in the activity, cultural aspects and local ecological knowledge about the environmental conditions that may be related to the conservation of the species. Thus, the intention was to relate experience in the activity, cultural aspects and local ecological knowledge to the conservation of the species.

Given the level of measurement (mostly nominal) and the lack of variability (concentration of responses) of the variables that could be treated quantitatively, added to the sample size, the statistical treatment options were limited to a Descriptive Analysis and a Bivariate Analysis. For this purpose, I adopted nonparametric tests for independent samples, mainly due to the sample size and, to some extent, to the levels of measurement of the variables: i) Spearman Correlation (ρ), between the scale/ordinal variables; and ii) Mann-Whitney Test (U), between the nominal variables with only two classes and the scale/ordinal variables.

In this report I evaluated the quality of water and sediment from the João Leite river reservoir in the rainy and dry season.  These were repeated measures.

In terms of strategy, in addition to the descriptive analysis, I performed a Principal Component Analysis (PCA), with the objective of reducing each of the dimensions, and fitted Generalized Linear Mixed Models (GzMM). The GzMM models considered had a normal distribution for the response variable and an identity link function, chosen for the flexibility to include factors that control the relationships and because this was a typical case of repeated samples (random effects).

In this project I compared areas of riparian forest that were under the effect of flooding: i) Experimental area (ES): planted with a native forest; ii) Degraded area (SD): without vegetation; and iii) Preserved area (PS): a preserved native forest. Each area had 3 zones representing greater and lesser influence of the river. Three samples were collected in each of the zones with regard to: 1) functional criteria of soil aggregation and porosity; 2) quantitative formation of humic and fulvic acids and their qualitative composition (aliphatic and aromatic). The main idea was to understand whether the recovered area looks like the degraded one based on the available variables.

I performed a Descriptive Analysis of the variables through tables and graphs and used two strategies to meet the objectives: 1) a Principal Component Analysis (PCA) with the research variables, testing the component found through a General Linear Model (GLM) with two factors: i) the sites as a fixed factor; and ii) the zones as a random factor, given that three samples, possibly correlated by definition, were collected in each of the zones and in the three sites; and 2) a Cluster Analysis (CA) to explore how the sites and zones would be grouped with regard to the attributes collected.

The present project was an experiment that aimed to evaluate the productivity/performance differences of lettuce cultivated with hydroponic system and irrigated with different brackish water treatments. The results were evaluated from three experiments and five treatments (factors) where information of 12 variables was collected: i) four related to lettuce mass (fresh mass of shoots, shoot dry mass, stem mass and moisture); ii) three others related to lettuce leaves (number of leaves, chlorophyll, and leaf area); and iii) five related to the minerals present in lettuce (nitrate, sodium, potassium, iron and phosphorus).

The strategy adopted was to apply three two-factor MANOVAs: i) one for the mass analysis (four dependent variables); ii) another for the leaf analysis (three dependent variables); and finally, iii) one for the mineral analysis (five dependent variables). Given the three levels of the experiments and the five levels of the treatments, that is, a maximum of five groups, there were 27 observations per group, which guaranteed detecting (at least!) a very large effect size at the recommended statistical power of 0.80. Alongside the MANOVAs, I performed multiple comparisons adjusted by Bonferroni to pinpoint exactly where the significant differences were.
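The Bonferroni adjustment applied to the multiple comparisons is simple enough to show directly. A minimal sketch, with a hypothetical function name and made-up p-values standing in for pairwise treatment comparisons:

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-value, significant?) for each comparison.

    Each raw p-value is multiplied by the number of comparisons and capped at 1.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]


# Hypothetical raw p-values from three pairwise comparisons between treatments
for p_adj, significant in bonferroni([0.01, 0.04, 0.5]):
    print(round(p_adj, 3), significant)
```

The adjustment keeps the family-wise error rate at or below alpha, at the cost of some power; with five treatments there are ten pairwise comparisons, so each raw p-value would be multiplied by ten.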

In this project, I analyzed the relationship between the degree of ecological knowledge and: i) age; ii) the level of education; iii) the time living in the community; and iv) the time of experience in the “pre-village” period of 51 indigenous people. These four factors were measured in years, and two others, related to ecological knowledge (repertoire and competencies), were measured through standardized tests (scores). I used a Path Analysis to evaluate the research problem, because the study expected the age of the individual to affect their time of schooling, community residence and “pre-village” period before influencing their ecological knowledge (repertoire and competencies), which characterizes a Mediation Analysis.

In this project I analyzed the relationship between the number of herbivore animals, number of propagules and environmental variables (precipitation, salinity, pH and temperature) in mangrove plants of the species Laguncularia racemosa, Avicennia germinans and Rhizophora mangle in São Luiz/MA (Alumar Port Terminal).

I had information organized over time, 36 months between 2009 and 2012, and, due to the extreme lack of normality of the variables, tested by the Kolmogorov-Smirnov test, I used nonparametric tests in the bivariate analysis: i) Kruskal-Wallis test, to test the relationship between the three species and the number of herbivores and propagules; ii) Spearman correlation, between the number of herbivores and propagules and the other environmental variables (salinity, pH, precipitation and temperature); iii) Kendall's partial rank-order correlation coefficient, between the number of herbivores and the other environmental variables, controlling for the number of propagules. As for statistical modeling, the problem required a Structural Equation Modeling (SEM) approach, since I had at least four endogenous variables in the system and, even with the problems found in the database (small sample, lack of multivariate normality and excess of outliers), I chose to fit a Path Analysis model by Generalized Least Squares (GLS) with Bootstrap resampling (with a view to alleviating potential problems).
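Kendall's partial rank-order correlation is less common than the other tests, so a short sketch may help. It combines the three pairwise Kendall coefficients with the usual partial-correlation formula. The functions use the simple tau-a (no tie correction), and the toy ranks are hypothetical, not the mangrove data.

```python
from math import sqrt


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


def partial_kendall_tau(x, y, z):
    """Kendall's partial correlation between x and y, controlling for z."""
    t_xy, t_xz, t_yz = kendall_tau(x, y), kendall_tau(x, z), kendall_tau(y, z)
    return (t_xy - t_xz * t_yz) / sqrt((1 - t_xz ** 2) * (1 - t_yz ** 2))


# Hypothetical monthly ranks: herbivores, salinity, and propagules (the control)
herbivores = [1, 2, 3, 4]
salinity = [1, 3, 2, 4]
propagules = [2, 1, 4, 3]
print(partial_kendall_tau(herbivores, salinity, propagules))
```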

In this project I analyzed and sought to understand the spatial distribution and perceived effects of the invasion of Acacia mangium Willd (Fabaceae) in three communities (Moskow, São Domingos and Malacheta), located in the Moskow Indigenous Land, in the northern Brazilian Amazon. The aim was to answer the following research questions: What are the main habitats in which invasive plants occur? What is the abundance of invasive plants in the swiddens? Does the distance from commercial planting influence the density of invasive plants in indigenous swiddens? What are the effects of the invasion of A. mangium on the work in the indigenous gardens?

A Descriptive Analysis of the variables answered most of the questions raised. In addition, I performed the Mann-Whitney test to check whether there is a relationship between the perceived effect on the swiddens and the time of coexistence with the plants. The main technique implemented, however, was Multiple Linear Regression, to elucidate the relationship between the distance from the swidden to the plantation and the density of the acacia trees.

In this project I analyzed the ecological knowledge of Wapichana and Macuxi indigenous peoples about the invasion of Acacia mangium Willd (Fabaceae) in savannah ecosystems (Lavrado), verified in three communities (Moskow, São Domingos and Malacheta), located in the Moskow Indigenous Land, in the northern Brazilian Amazon. To meet the objectives, I had data from semi-structured research conducted through a questionnaire in a sample of about 100 individuals from these communities. The questionnaire was composed of sociodemographic questions and other information related to the knowledge of Acacia mangium Willd (Acacia), such as recognition of the plant, where the plant occurs, how long the plant has been known and the perception of changes in community routine.

Together with a descriptive analysis of the variables, I performed Chi-square tests to verify the relationship between two nominal variables, and the Mann-Whitney and Kruskal-Wallis tests to verify whether a scale variable differs across the categories of a nominal variable: the first when the nominal variable has only two categories and the second when it has more than two. Among the scale variables, I calculated Spearman's correlation coefficient.